every edge e ∈ E has an associated weight we. A cut C ⊂ E is a partition of the vertices V of the graph G into two disjoint sets S, T where Vs ∈ S and Vt ∈ T. The cost of each cut C is the sum of the weighted edges e ∈ C. The minimum cut problem can then be defined as finding the cut with the minimum cost, which can be achieved in near polynomial-time.

5.2. Labels

The binary case can easily be extended to the case of multiple terminal vertices. We create two terminal vertices, for foreground O and background B pixels, for each orientation θ for which 0 ≤ θ ≤ π. In our experiments, we have found that choosing the number of orientation labels in the range Nθ = [2, 16] generates acceptable results. Thus the set of labels L is defined to be L = {Oθ1, Bθ1, Oθ2, Bθ2, ..., OθNθ, BθNθ} with size |L| = 2 ∗ Nθ.

5.3. Energy minimization function

Finding the minimum cut of a graph is equivalent to finding an optimal labeling f : I −→ L which assigns a label l ∈ L to each pixel p ∈ I, where f is piecewise smooth and consistent with the original data. Thus, our energy function for the graph-cut minimization is given by

E(f) = Edata(f) + λ ∗ Esmooth(f)    (6)

Energy smoothness term. The smoothness term provides a measure of the difference between two neighbouring pixels p, q ∈ I with labels lp, lq ∈ L respectively. Let Ip and Iq be the intensity values in the observed data of the pixels p, q ∈ I respectively. Similarly, let θp and θq be the initial orientations for the two pixels, recovered as explained in Section 4.2. We define a measure of the observed smoothness between pixels p and q as

∆p,q = (1 + (Ip − Iq)^2) / (2 − ||θp − θq||^2)    (10)

In addition, we define a measure of smoothness for the global minimization. Let If(p) and If(q) be the intensity values under a labeling f. Similarly, let θf(p) and θf(q) be the orientations under the same labeling. We define a measure of the smoothness between neighbouring pixels p, q under a labeling f as

∆̂p,q = (1 + (If(p) − If(q))^2) / (2 − ||θf(p) − θf(q)||^2)    (11)

Using the smoothness measure defined for the observed data and the smoothness measure defined for any given labeling, we can finally define the energy smoothness term as follows,

Esmooth(f) = Σ{p,q}∈N V{p,q}(f(p), f(q))    (12)
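The label set of Section 5.2 can be constructed programmatically. The sketch below is not from the paper; the function name and the tuple encoding of labels are assumptions, but it reproduces the definition |L| = 2 ∗ Nθ with Nθ orientations sampled in [0, π):

```python
import math

def build_label_set(n_theta):
    """Build the label set L = {O_theta_i, B_theta_i}: one foreground (O)
    and one background (B) terminal label per sampled orientation in [0, pi)."""
    thetas = [i * math.pi / n_theta for i in range(n_theta)]
    labels = []
    for t in thetas:
        labels.append(("O", t))  # foreground label at orientation t
        labels.append(("B", t))  # background label at orientation t
    return labels

labels = build_label_set(8)
print(len(labels))  # |L| = 2 * N_theta = 16
```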
Esmooth(f) = Σ{p,q}∈N Kp,q ∗ ∆̂p,q    (13)

where N is the set of neighbouring pixels, Kp,q = e^(−∆p,q^2 / (2 ∗ σ^2)), and σ controls the smoothness uncertainty. Intuitively, if two neighbouring pixels p and q have similar intensity and similar orientation in the observed data, then ∆p,q will be small and thus there is a high probability of ∆̂p,q being small. To summarize, the function E(f) penalizes heavily for severed edges between neighbouring pixels with similar intensity and orientation, and vice versa.
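For concreteness, the observed smoothness of equation 10 and the edge weight Kp,q can be sketched as below. This is an illustration rather than the authors' code: the function names are invented, and the denominator is taken literally from equation 10, so it assumes orientation differences small enough to keep it positive.

```python
import math

def observed_smoothness(I_p, I_q, theta_p, theta_q):
    """Eq. (10): small when neighbouring pixels have similar intensity
    and similar orientation (assumes the denominator stays positive)."""
    return (1.0 + (I_p - I_q) ** 2) / (2.0 - abs(theta_p - theta_q) ** 2)

def smoothness_weight(delta_pq, sigma):
    """K_{p,q} = exp(-delta^2 / (2 sigma^2)); close to 1 for similar pixels,
    so severing an edge between similar pixels is heavily penalized."""
    return math.exp(-(delta_pq ** 2) / (2.0 * sigma ** 2))

d = observed_smoothness(0.50, 0.52, 0.30, 0.35)  # similar pixels -> small delta
w = smoothness_weight(d, sigma=1.0)              # high edge weight
```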
An advantage of the proposed orientation-based segmentation is that, by incorporating orientation information in the optimization process, it ensures that linear segments are not severed, even in the case where the color difference between neighbouring pixels is relatively big. By using the classified curve feature information to guide the segmentation process, we combine the fast computational times of graph-cuts and the high accuracy of the information derived using the perceptual grouping to produce results with better defined boundaries compared to traditional segmentation techniques, as demonstrated in Figure 4.

Figure 4. Comparison between traditional intensity- and orientation-based segmentation. (a) Original image. (b) Intensity-based segmentation. (c) Orientation-based segmentation. (d) Green: only in intensity segmentation, blue: only in orientation segmentation.

Figure 5. (a) The bi-modal filter Gb is applied to the classified curve features. (b) Red arrows: filter orientation (at peaks). Black arrows: actual pixel orientation.

Bi-modal filters of different orientations φ and widths w are applied to the classified curve features computed previously, as explained in Section 4. In order to overcome problems arising from the coincidental presence of two curve pixels along the filters' peaks, orientation information is used to weight the response. This ensures that the maximum response only occurs when both pixels have the same orientation and are aligned to the filter's orientation. Figure 5(b) demonstrates the application of a bi-modal filter to a point O. The orientations θL and θR of the left and right road side points pL and pR respectively are used to scale the response. Thus, equation 14 becomes,

Gb = (1/√(2πσxσy)) ∗ [cos(θL) ∗ e^(−[(x − w/2)r^2 / σx^2 + yr^2 / σy^2]) + cos(θR) ∗ e^(−[(x + w/2)r^2 / σx^2 + yr^2 / σy^2])]

In addition to the bi-modal filters, single-mode gaussian filters are applied to the segmented binary image containing the road candidates. This ensures that the area between any parallel lines is indeed a part of the road and therefore should appear in the result of the segmentation.
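A minimal sketch of evaluating a bi-modal kernel of this form on a pixel grid, without the cos(θL), cos(θR) scaling, using the rotated coordinates of equations 15 and 16; the grid size and parameter names are illustrative assumptions, not values from the paper:

```python
import numpy as np

def bimodal_kernel(size, w, phi, sigma_x, sigma_y):
    """Two Gaussian lobes separated by w (the expected road width),
    rotated by phi, following the form of Eqs. (14)-(16)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)

    def lobe(shift):
        u = x - shift                               # shift to the lobe centre
        ur = u * np.cos(phi) + y * np.sin(phi)      # rotated coordinate, Eq. (15)
        vr = -u * np.sin(phi) + y * np.cos(phi)     # rotated coordinate, Eq. (16)
        return np.exp(-(ur ** 2 / sigma_x ** 2 + vr ** 2 / sigma_y ** 2))

    norm = 1.0 / np.sqrt(2.0 * np.pi * sigma_x * sigma_y)
    return norm * (lobe(w / 2.0) + lobe(-w / 2.0))

G = bimodal_kernel(size=21, w=10, phi=0.0, sigma_x=2.0, sigma_y=4.0)
# With phi = 0 the two response peaks sit w pixels apart along the x axis.
```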
6. Road Network Extraction and Modeling

6.1. Road Centerline Extraction and Linearization

The extraction of the road centerlines is performed using a set of gaussian-based filters. A bi-modal filter is employed to detect parallel lines and is defined as a mixture of gaussian kernels given by,

Gb = (1/√(2πσxσy)) ∗ [e^(−[(x − w/2)r^2 / σx^2 + yr^2 / σy^2]) + e^(−[(x + w/2)r^2 / σx^2 + yr^2 / σy^2])]    (14)

where the (·)r subscript stands for a rotation operation such that

(x − w/2)r = (x − w/2)cos(φ) + y sin(φ)    (15)

yr = −(x − w/2)sin(φ) + y cos(φ)    (16)

where φ is the orientation of the filter, 0 ≤ φ ≤ π, and w is the distance between the peaks. The bi-modal filter is shown in Figure 5(a).

Single-mode and bi-modal filters of different widths and orientations are combined as Gt = Gb ∗ Gs and are used for the extraction of centerline information. A point along the centerline of a road of orientation θR and width wR will have a maximum response to a filter with the same or similar orientation and width. Thus, for each pixel we record the filter parameters (orientation, width) for which it returns a maximum response.

Finally, the centerline response magnitudes are used as votes in an iterative Hough transform. This has the significant advantage that no input parameters are required for the Hough transform, such as the number of peaks, minimum vote thresholds, etc., therefore making the linearization process entirely automatic. The result is a set of lines representing the segments of the road network, as shown in the example of Figure 6. The majority of the centerlines are correctly extracted automatically. However, some false positives still exist.

5. Smoothing. The centerline vector is converted to dense points. A snake is then used to refine the spatial position of those points, using the centerline magnitude map (Figure 6(a)) as an external force.
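The per-pixel bookkeeping described above (keep, for every pixel, the magnitude and the (orientation, width) of the strongest filter response) can be sketched as follows. The response maps here are stand-ins for actual Gt filter outputs, and the function name is an assumption:

```python
import numpy as np

def record_best_params(responses, params):
    """responses: list of 2-D response maps, one per (phi, w) filter in the bank.
    params: the matching list of (phi, w) pairs.
    Returns per-pixel response magnitude (the Hough votes) and winning parameters."""
    stack = np.stack(responses)                     # (n_filters, H, W)
    best = stack.argmax(axis=0)                     # strongest filter per pixel
    magnitude = stack.max(axis=0)                   # centerline response magnitude
    phis = np.array([p[0] for p in params])[best]   # winning orientation per pixel
    widths = np.array([p[1] for p in params])[best] # winning width per pixel
    return magnitude, phis, widths

# Toy bank of two filters: the second one wins at the centre pixel.
r0 = np.zeros((3, 3))
r1 = np.zeros((3, 3))
r1[1, 1] = 0.9
mag, phis, widths = record_best_params([r0, r1], [(0.0, 8), (1.57, 12)])
```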
Figure 7. The result of a 2Kx2K urban area. (a) Centerline vectors overlaid on the original image. (b) Tracked road segments using the automatically extracted width and orientation. Note the overlap at junctions. (c) Road network using polygonal representation. The overlaps are correctly handled by the boolean operations to form properly looking intersections/junctions.

curve features to produce segmentations with better defined boundaries. Finally, a set of gaussian-based filters were developed for