
Digital Image Processing

Unit 9: Image Segmentation

Structure
9.1 Introduction
    Objectives
9.2 Detection of Discontinuities
    Point Detection
    Line Detection
    Edge Detection
9.3 Edge Linking and Boundary Detection
    Local Processing
    Global Processing via the Hough Transform
    Global Processing via Graph-Theoretic Techniques
9.4 Summary
9.5 Terminal Questions
9.6 Answers

9.1 Introduction
Segmentation subdivides an image into its constituent regions or objects. The level to which the subdivision is carried depends on the problem being solved; that is, segmentation should stop when the objects of interest in an application have been isolated. Segmentation of nontrivial images is one of the most difficult tasks in image processing, and segmentation accuracy determines the eventual success or failure of computerized analysis procedures. For this reason, considerable care should be taken to improve the probability of rugged segmentation. Image segmentation algorithms generally are based on one of two basic properties of intensity values: discontinuity and similarity. In the first category, the approach is to partition an image based on abrupt changes in intensity, such as edges. The principal approaches in the second category are based on partitioning an image into regions that are similar according to a set of predefined criteria.


Objectives

In this unit we will learn:
- Detection of Discontinuities
- Edge Linking and Boundary Detection

9.2 Detection of Discontinuities


There are three basic types of discontinuities in a digital image: points, lines, and edges. In practice, the most common way to look for discontinuities is to run a mask through the image. For the 3 x 3 mask shown in Fig. 9.1, this procedure involves computing the sum of products of the coefficients with the gray levels contained in the region encompassed by the mask. That is, the response of the mask at any point in the image is

R = w1 z1 + w2 z2 + ... + w9 z9 = Σ (i = 1 to 9) wi zi

Figure 9.1: A general 3 x 3 mask

where zi is the gray level of the pixel associated with mask coefficient wi. As usual, the response of the mask is defined with respect to its center location. When the mask is centered on a boundary pixel, the response is computed by using the appropriate partial neighborhood.
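As a concrete illustration, here is a minimal sketch of the mask-response computation in numpy; the function and variable names are our own, not from the text, and border pixels are simply skipped rather than handled with partial neighborhoods.

```python
import numpy as np

def mask_response(image, mask):
    """Correlate a 3 x 3 mask with a grayscale image.

    Returns R(x, y) = sum_i w_i * z_i at every interior pixel;
    border pixels are left at zero for simplicity (the text instead
    uses partial neighborhoods there).
    """
    image = image.astype(float)
    rows, cols = image.shape
    R = np.zeros_like(image)
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            region = image[x - 1:x + 2, y - 1:y + 2]  # 3 x 3 neighborhood
            R[x, y] = np.sum(mask * region)           # sum of products
    return R
```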

9.2.1 Point Detection

The detection of isolated points in an image is straightforward. Using the mask shown in Fig. 9.2, we say that a point has been detected at the location on which the mask is centered if

|R| > T

where T is a nonnegative threshold and R is the response of the mask at that point. Basically, all that this formulation does is measure the weighted differences between the center point and its neighbors. The idea is that the gray level of an isolated point will be quite different from the gray levels of its neighbors.
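A short sketch of this test follows, reusing the mask_response helper from the sketch above. The mask values are the standard point-detection coefficients and are an assumption about what Fig. 9.2 shows.

```python
import numpy as np

# Standard point-detection mask (assumed to match Fig. 9.2):
# -1 everywhere, 8 at the center, so the coefficients sum to zero.
POINT_MASK = np.array([[-1, -1, -1],
                       [-1,  8, -1],
                       [-1, -1, -1]])

def detect_points(image, T):
    """Flag pixels where |R| > T for the point-detection mask."""
    R = mask_response(image, POINT_MASK)  # helper from the earlier sketch
    return np.abs(R) > T
```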

Figure 9.2: A mask used for detecting isolated points different from a constant background.

The mask in Fig. 9.2 is the same as the mask used for high-frequency spatial filtering. The emphasis here, however, is strictly on the detection of points: only differences that are large enough to be considered isolated points in an image are of interest.

9.2.2 Line Detection

Line detection is an important step in image processing and analysis. Lines and edges are features in any scene, from simple indoor scenes to noisy terrain images taken by satellite. Most of the earlier methods for detecting lines were based on pattern matching, with the patterns following directly from the definition of a line. These pattern templates are designed with suitable coefficients and are applied at each point in an image. A set of such templates is shown in Fig. 9.3. If the first mask were moved around an image, it would respond more strongly to lines oriented horizontally. With a constant background, the maximum response would result when the line

passed through the middle row of the mask. This is easily verified by sketching a simple array of 1s with a line of a different gray level running horizontally through the array. A similar experiment would reveal that the second mask in Fig. 9.3 responds best to lines oriented at +45°, the third mask to vertical lines, and the fourth mask to lines in the -45° direction. These directions can also be established by noting that the preferred direction of each mask is weighted with a larger coefficient (i.e., 2) than the other possible directions. Let R1, R2, R3, and R4 denote the responses of the masks in Fig. 9.3, from left to right, where each R is computed as the mask response defined in Section 9.2. Suppose that all four masks are run through an image. If, at a certain point in the image, |Ri| > |Rj| for all j ≠ i, that point is said to be more likely associated with a line in the direction of mask i. For example, if at a point |R1| > |Rj| for j = 2, 3, 4, that particular point is said to be more likely associated with a horizontal line. A sketch of this selection rule follows the figure below.

Figure 9.3: Line masks (a) Horizontal (b) +45° (c) Vertical (d) -45°
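The sketch below implements the direction-selection rule; the four masks are the standard line-detection templates and are assumed to match Fig. 9.3.

```python
import numpy as np

# Standard line-detection masks (assumed to match Fig. 9.3):
LINE_MASKS = {
    "horizontal": np.array([[-1, -1, -1],
                            [ 2,  2,  2],
                            [-1, -1, -1]]),
    "+45":        np.array([[-1, -1,  2],
                            [-1,  2, -1],
                            [ 2, -1, -1]]),
    "vertical":   np.array([[-1,  2, -1],
                            [-1,  2, -1],
                            [-1,  2, -1]]),
    "-45":        np.array([[ 2, -1, -1],
                            [-1,  2, -1],
                            [-1, -1,  2]]),
}

def likely_line_direction(region):
    """Given a 3 x 3 region, return the direction whose mask gives the
    largest |R|, per the rule |Ri| > |Rj| for all j != i."""
    responses = {name: abs(np.sum(m * region)) for name, m in LINE_MASKS.items()}
    return max(responses, key=responses.get)
```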

9.2.3 Edge Detection

Edge detection is the most common approach for detecting meaningful discontinuities in gray level. We discuss approaches for implementing the first-order derivative (gradient operator) and the second-order derivative (Laplacian operator).

Basic Formulation

An edge is a set of connected pixels that lie on the boundary between two regions. An edge is a local concept, whereas a region boundary, owing to the way it is defined, is a more global idea.


We start by modeling an edge intuitively. This will lead us to a formalism in which meaningful transitions in gray levels can be measured. Intuitively, an ideal edge has the properties of the model shown in Fig. 9.4(a). An ideal edge according to this model is a set of connected pixels, each located at an orthogonal step transition in gray level.

Figure 9.4: (a) Model of an ideal digital edge. (b) Model of a ramp edge. The slope of the ramp is inversely proportional to the degree of blurring in the edge.

In practice, optics, sampling, and other acquisition imperfections yield edges that are blurred, with the degree of blurring being determined by factors such as the quality of the image acquisition system, the sampling rate, and the illumination conditions under which the image is acquired. As a result, edges are more closely modeled as having a ramp-like profile, such as the one shown in Fig. 9.4(b). The slope of the ramp is inversely proportional to the degree of blurring in the edge. In this model, we no longer have a thin (one pixel thick) path. Instead, an edge point now is any point contained in the ramp, and an edge would then be a set of such points that are connected. The thickness of the edge is determined by the length of the ramp; the length is determined by the slope, which is in turn determined by the degree of blurring. Blurred edges tend to be thick, and sharp edges tend to be thin.

Fig. 9.4(c) shows the image from which the close-up in Fig. 9.4(b) was extracted, and Fig. 9.4(d) shows a horizontal gray-level profile of the edge between the two regions, together with the first and second derivatives of the profile. The first derivative is positive at the points of transition into and out of the ramp as we move from left to right along the profile, constant for points in the ramp, and zero in areas of constant gray level. The second derivative is positive at the transition associated with the dark side of the edge, negative at the transition associated with the light side of the edge, and zero along the ramp and in areas of constant gray level. Two additional properties of the second derivative around an edge are:
- It produces two values for every edge in an image (an undesirable feature).
- An imaginary straight line joining the extreme positive and negative values of the second derivative would cross zero near the midpoint of the edge (the zero-crossing property).

Figure 9.4(c) Two regions separated by a vertical edge. (d) Detail near the edge, showing a gray-level profile, and the first and second derivatives of the profile.


9.2.3.1 Gradient Operators

First-order derivatives of a digital image are based on various approximations of the 2-D gradient. The gradient of an image f(x, y) at location (x, y) is defined as the vector

∇f = [Gx, Gy]^T = [∂f/∂x, ∂f/∂y]^T    (1)

It is well known from vector analysis that the gradient vector points in the direction of maximum rate of change of f at coordinates (x, y). An important quantity in edge detection is the magnitude of this vector, denoted ∇f, where

∇f = mag(∇f) = [Gx² + Gy²]^(1/2)    (2)

This quantity gives the maximum rate of increase of f(x, y) per unit distance in the direction of ∇f. It is common practice to refer to the magnitude also as the gradient; we adhere to this convention and use the term interchangeably, differentiating between the vector and its magnitude only in cases in which confusion is likely. The direction of the gradient vector also is an important quantity. Let α(x, y) represent the direction angle of the vector ∇f at (x, y). Then, from vector analysis,

α(x, y) = tan⁻¹(Gy / Gx)    (3)

where the angle is measured with respect to the x-axis. The direction of an edge at (x, y) is perpendicular to the direction of the gradient vector at that point. Computation of the gradient of an image is based on obtaining the partial derivatives ∂f/∂x and ∂f/∂y at every pixel location. Let the 3 x 3 area shown in Fig. 9.5(a) represent the gray levels in a neighborhood of an image. One of the simplest ways to implement a first-order partial derivative at point z5 is to use the following Roberts cross-gradient operators:


Gx = z9 - z5    (4)
and
Gy = z8 - z6    (5)

These derivatives can be implemented for an entire image by using the masks shown in Fig. 9.5(b). Masks of size 2 x 2 are awkward to implement because they do not have a clear center. An approach using masks of size 3 x 3 is given by

Gx = (z7 + z8 + z9) - (z1 + z2 + z3)    (6)
and
Gy = (z3 + z6 + z9) - (z1 + z4 + z7)    (7)

In this formulation, the difference between the third and first rows of the 3 x 3 image region approximates the derivative in the x-direction, and the difference between the third and first columns approximates the derivative in the y-direction. The masks shown in Figs. 9.5(d) and (e), called the Prewitt operators, can be used to implement these two equations. A slight variation of these two equations uses a weight of 2 in the center coefficient:

Gx = (z7 + 2z8 + z9) - (z1 + 2z2 + z3)    (8)
and
Gy = (z3 + 2z6 + z9) - (z1 + 2z4 + z7)    (9)

A weight value of 2 is used to achieve some smoothing by giving more importance to the center point. Figs. 9.5(f) and (g), called the Sobel operators, are used to implement these two equations, where the z's are the gray levels of the pixels overlapped by the masks at any location in an image. Computation of the gradient at the location of the center of the masks then utilizes equation (2), which gives one value of the gradient. To get the next value, the masks are moved to the next pixel location and the procedure is repeated. Thus, after the procedure has been completed for all possible locations, the result is a gradient image of the same size as the original image. As usual, mask operations on the border of an image are implemented by using the appropriate partial neighborhoods.
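A minimal sketch of Sobel-based gradient computation using equations (2), (3), (8), and (9); the helper names are ours, and border handling is again simplified.

```python
import numpy as np

SOBEL_X = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])   # rows: (z7+2z8+z9) - (z1+2z2+z3)
SOBEL_Y = np.array([[-1,  0,  1],
                    [-2,  0,  2],
                    [-1,  0,  1]])   # columns: (z3+2z6+z9) - (z1+2z4+z7)

def sobel_gradient(image):
    """Return gradient magnitude (eq. 2) and direction angle (eq. 3)."""
    image = image.astype(float)
    rows, cols = image.shape
    mag = np.zeros_like(image)
    angle = np.zeros_like(image)
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            region = image[x - 1:x + 2, y - 1:y + 2]
            gx = np.sum(SOBEL_X * region)
            gy = np.sum(SOBEL_Y * region)
            mag[x, y] = np.hypot(gx, gy)       # [Gx^2 + Gy^2]^(1/2)
            angle[x, y] = np.arctan2(gy, gx)   # tan^-1(Gy/Gx)
    return mag, angle
```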

Figure 9.5: A 3 x 3 region of an image (the z's are gray-level values) and various masks used to compute the gradient at the point labeled z5.

9.2.3.2 Laplacian Operators

The Laplacian of a 2-D function f(x, y) is a second-order derivative defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²    (10)

For a 3 x 3 region, the form most frequently encountered in practice is

∇²f = 4z5 - (z2 + z4 + z6 + z8)    (11)

where the z's are the gray levels of the pixels associated with the mask coefficients. The basic requirement in defining the digital Laplacian is that the coefficient associated with the center pixel be positive and the coefficients associated with the outer pixels be negative. Because the Laplacian is a derivative, the sum of the coefficients has to be zero. Hence, the response is zero whenever the point in question and its neighbors have the same


value. Fig. 9.6 shows a spatial mask that can be used to implement equation (11).

Figure 9.6: Masks used to compute the Laplacian
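A quick sketch of equation (11) as a mask operation; the coefficients below are the standard digital Laplacian corresponding to that equation and are assumed to match Fig. 9.6.

```python
import numpy as np

# Digital Laplacian mask for equation (11): 4 at the center,
# -1 at the 4-neighbors, 0 at the corners; coefficients sum to zero.
LAPLACIAN_MASK = np.array([[ 0, -1,  0],
                           [-1,  4, -1],
                           [ 0, -1,  0]])

def laplacian(image):
    """Apply the 3 x 3 Laplacian of equation (11) at interior pixels."""
    image = image.astype(float)
    out = np.zeros_like(image)
    rows, cols = image.shape
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            out[x, y] = np.sum(LAPLACIAN_MASK * image[x - 1:x + 2, y - 1:y + 2])
    return out
```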

Although the Laplacian responds to transitions in intensity, it is seldom used in practice for edge detection, for several reasons. As a second-order derivative, the Laplacian typically is unacceptably sensitive to noise. Moreover, the Laplacian produces double edges and is unable to detect edge direction. For these reasons, the Laplacian usually plays the secondary role of a detector for establishing whether a pixel is on the dark or light side of an edge. A more general use of the Laplacian is in finding the location of edges using its zero-crossing property. This concept is based on convolving an image with the Laplacian of a 2-D Gaussian function of the form

h(x, y) = exp( -(x² + y²) / (2σ²) )    (12)

where σ is the standard deviation. Then, from equation (10), the Laplacian of h is

∇²h = ( (r² - σ²) / σ⁴ ) exp( -r² / (2σ²) )    (13)

where r² = x² + y².


Figure 9.7: Cross section showing zero crossings

Fig. 9.7 shows a cross section of this circularly symmetric function. Note the smoothness of the function, its zero crossings at r = ±σ, and the positive center and negative skirts. This shape is the model upon which equation (11) and the masks in Fig. 9.6 are based. When viewed in 3-D perspective with the vertical axis corresponding to intensity, equation (13) has the classical Mexican-hat shape. It can be shown that the average value of the Laplacian operator ∇²h is zero, and the same is true of a Laplacian image obtained by convolving this operator with a given image. Convolution of an image with a Gaussian of the form given in equation (12) blurs the image, with the degree of blurring being proportional to σ. Although this property has value in terms of noise reduction, the usefulness of equation (13) lies in its zero crossings. The preceding discussion has one further implication. Edge detection by gradient operations tends to work well in cases involving images with sharp intensity transitions and relatively low noise. Zero crossings offer an alternative when edges are blurry or when high noise content is present: the zero crossings give reliable edge locations, and the smoothing properties of ∇²h reduce the effects of noise. The price paid for these advantages is increased computational complexity and time.
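A hedged sketch of Laplacian-of-Gaussian edge location by zero crossings: sample the ∇²h kernel of equation (13), apply it with any 2-D correlation routine (for example, scipy.ndimage.convolve), and mark sign changes. The kernel-size rule below is an illustrative assumption.

```python
import numpy as np

def log_kernel(sigma, size=None):
    """Sample the Laplacian-of-Gaussian of equation (13) on a grid."""
    if size is None:
        size = int(6 * sigma) | 1          # odd width covering about ±3σ
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2
    return ((r2 - sigma**2) / sigma**4) * np.exp(-r2 / (2 * sigma**2))

def zero_crossings(response):
    """Mark pixels where the LoG response changes sign horizontally
    or vertically (a simple zero-crossing test)."""
    h = np.sign(response[:, :-1]) != np.sign(response[:, 1:])
    v = np.sign(response[:-1, :]) != np.sign(response[1:, :])
    out = np.zeros(response.shape, dtype=bool)
    out[:, :-1] |= h
    out[:-1, :] |= v
    return out
```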

9.3 Edge Linking and Boundary Detection


Edge linking and boundary detection operations are fundamental steps in any image understanding. The edge-linking process takes as input an unordered set of edge pixels produced by an edge detector and forms an ordered list of edges. Local edge information is utilized by the edge-linking operation; thus

edge detection algorithms typically are followed by a linking procedure to assemble edge pixels into meaningful edges.

9.3.1 Local Processing

One of the simplest approaches to linking edge points is to analyze the characteristics of pixels in a small neighborhood (say, 3 x 3 or 5 x 5) about every point (x, y) in an image that has undergone edge detection. All points that are similar are linked, forming a boundary of pixels that share some common properties. The two principal properties used for establishing similarity of edge pixels in this kind of analysis are (1) the strength of the response of the gradient operator used to produce the edge pixel, and (2) the direction of the gradient vector. The first property is given by the value of ∇f, the gradient magnitude. Thus an edge pixel with coordinates (x0, y0) in a predefined neighborhood of (x, y) is similar in magnitude to the pixel at (x, y) if

|∇f(x, y) - ∇f(x0, y0)| ≤ E

where E is a nonnegative threshold. The direction (angle) of the gradient vector is given by equation (3). An edge pixel at (x0, y0) in the predefined neighborhood of (x, y) has an angle similar to the pixel at (x, y) if

|α(x, y) - α(x0, y0)| < A

where A is a nonnegative angle threshold. As noted in Section 9.2.3, the direction of the edge at (x, y) is perpendicular to the direction of the gradient vector at that point. A point in the predefined neighborhood of (x, y) is linked to the pixel at (x, y) if both magnitude and direction criteria are satisfied. This process is repeated at every location in the image, and a record must be kept of linked points as the center of the neighborhood is moved from pixel to pixel. A simple bookkeeping procedure is to assign a different gray level to each set of linked edge pixels.
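The sketch below links a neighbor to the current pixel when both the magnitude and angle criteria hold, and labels each linked set with a distinct integer as the bookkeeping device mentioned above. This is our own minimal flood-fill formulation, under assumed threshold names E and A, not the book's exact procedure.

```python
import numpy as np

def link_edges(mag, angle, edge_mask, E, A):
    """Label linked edge pixels: 8-neighbors join a set when
    |magnitude difference| <= E and |angle difference| < A."""
    labels = np.zeros(mag.shape, dtype=int)
    next_label = 1
    rows, cols = mag.shape
    for x in range(rows):
        for y in range(cols):
            if edge_mask[x, y] and labels[x, y] == 0:
                stack = [(x, y)]              # grow one linked set
                labels[x, y] = next_label
                while stack:
                    cx, cy = stack.pop()
                    for dx in (-1, 0, 1):
                        for dy in (-1, 0, 1):
                            nx, ny = cx + dx, cy + dy
                            if (0 <= nx < rows and 0 <= ny < cols
                                    and edge_mask[nx, ny] and labels[nx, ny] == 0
                                    and abs(mag[cx, cy] - mag[nx, ny]) <= E
                                    and abs(angle[cx, cy] - angle[nx, ny]) < A):
                                labels[nx, ny] = next_label
                                stack.append((nx, ny))
                next_label += 1
    return labels
```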


9.3.2 Global Processing via the Hough Transform

In this section, points are linked by determining first whether they lie on a curve of specified shape. Unlike the local analysis method, we now consider global relationships between pixels. Suppose that, for n points in an image, we want to find subsets of these points that lie on straight lines. One possible solution is to first find all lines determined by every pair of points and then find all subsets of points that are close to particular lines. The problem with this procedure is that it involves finding n(n-1)/2 lines and then performing n·n(n-1)/2 comparisons of every point to all lines. This approach is computationally prohibitive in all but the most trivial applications.

Figure 9.8: (a) xy plane; (b) parameter space

Hough proposed an alternative approach, commonly referred to as the Hough transform. Consider a point (xi, yi) and the general equation of a straight line in slope-intercept form, yi = a·xi + b. Infinitely many lines pass through (xi, yi), but they all satisfy this equation for varying values of a and b. However, writing the equation as b = -xi·a + yi and considering the ab plane (also called parameter space) yields the equation of a single line for the fixed pair (xi, yi). Furthermore, a second point (xj, yj) also has a line in parameter space associated with it, and this line intersects the line associated with (xi, yi) at (a', b'), where a' is the slope and b' the intercept of the line containing both (xi, yi) and (xj, yj) in the xy plane. In fact, all points contained on this line have lines in parameter space that intersect at (a', b'). Fig. 9.8 illustrates these concepts.

The computational attractiveness of the Hough transform arises from subdividing the parameter space into so-called accumulator cells, as illustrated in Fig. 9.9, where (a_min, a_max) and (b_min, b_max) are the expected ranges of slope and intercept values. The cell at coordinates (i, j), with accumulator value A(i, j), corresponds to the square associated with parameter-space coordinates (ai, bj). Initially, these cells are set to zero. Then, for every point (xk, yk) in the image plane, we let the parameter a equal each of the allowed subdivision values on the a axis and solve for the corresponding b using the equation b = -xk·a + yk. The resulting b's are then rounded off to the nearest allowed value on the b axis. If a choice of ap results in solution bq, we let A(p, q) = A(p, q) + 1. At the end of this procedure, a value of M in A(i, j) corresponds to M points in the xy plane lying on the line y = ai·x + bj. The accuracy of the collinearity of these points is determined by the number of subdivisions in the ab plane.

Figure 9.9: Quantization of the parameter plane for use in the Hough transform


Note that subdividing the a axis into K increments gives, for every point (xk, yk), K values of b corresponding to the K possible values of a. With n image points, this method involves nK computations; the procedure is therefore linear in n.
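A minimal sketch of the accumulator procedure just described, using the slope-intercept form b = -x·a + y; the parameter ranges and cell counts below are illustrative assumptions.

```python
import numpy as np

def hough_slope_intercept(points, a_range=(-2.0, 2.0), b_range=(-100.0, 100.0),
                          K=64, J=64):
    """Accumulate votes A(p, q) for lines y = a*x + b through the points."""
    a_vals = np.linspace(*a_range, K)
    b_vals = np.linspace(*b_range, J)
    A = np.zeros((K, J), dtype=int)
    for x, y in points:
        for p, a in enumerate(a_vals):
            b = -x * a + y                         # b = -x_k a + y_k
            if b_range[0] <= b <= b_range[1]:
                q = np.argmin(np.abs(b_vals - b))  # round to nearest b cell
                A[p, q] += 1
    return A, a_vals, b_vals
```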

A problem with using the equation y = ax + b to represent a line is that both the slope and the intercept approach infinity as the line approaches the vertical. One way around this difficulty is to use the normal representation of a line:

x cos θ + y sin θ = ρ    (14)

Fig. 9.10(a) shows the meaning of the parameters used in equation (14). The use of this representation in constructing a table of accumulators is identical to the method discussed for the slope-intercept representation. Instead of straight lines, however, the loci are sinusoidal curves in the ρθ plane. As before, M collinear points lying on a line x cos θj + y sin θj = ρi yield M sinusoidal curves that intersect at (ρi, θj) in the parameter space. Incrementing θ and solving for the corresponding ρ gives M entries in accumulator A(i, j) associated with the cell determined by (ρi, θj). Fig. 9.10(b) illustrates the subdivision of the parameter space. The range of angle θ is ±90°, measured with respect to the x-axis. Thus, with reference to Fig. 9.10(a), a horizontal line has θ = 0°, with ρ equal to the positive x intercept; similarly, a vertical line has θ = 90°, with ρ equal to the positive y intercept, or θ = -90°, with ρ equal to the negative y intercept.

Although the focus so far has been on straight lines, the Hough transform is applicable to any function of the form g(v, c) = 0, where v is a vector of coordinates and c is a vector of coefficients. For example, the points lying on the circle

(x - c1)² + (y - c2)² = c3²    (15)

can be detected by using the above approach.
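A sketch of the normal-form accumulator for the nonzero pixels of a binary image, using ρ = x cos θ + y sin θ with θ in ±90°; the discretization choices are assumptions.

```python
import numpy as np

def hough_lines(binary, n_theta=180, n_rho=200):
    """Vote in (rho, theta) space for each nonzero pixel, per eq. (14)."""
    rows, cols = binary.shape
    thetas = np.deg2rad(np.linspace(-90, 90, n_theta))
    rho_max = np.hypot(rows, cols)                 # largest possible |rho|
    rhos = np.linspace(-rho_max, rho_max, n_rho)
    A = np.zeros((n_rho, n_theta), dtype=int)
    xs, ys = np.nonzero(binary)
    for x, y in zip(xs, ys):
        for j, t in enumerate(thetas):
            rho = x * np.cos(t) + y * np.sin(t)    # sinusoid in parameter space
            i = np.argmin(np.abs(rhos - rho))      # nearest rho cell
            A[i, j] += 1
    return A, rhos, thetas
```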


Figure 9.10: (a) Normal representation of a line; (b) Quantization of the ρθ plane into cells

The basic difference is the presence of three parameters (c1, c2, c3), which results in a 3-D parameter space with cube-like cells and accumulators of the form A(i, j, k). The procedure is to increment c1 and c2, solve for the c3 that satisfies equation (15), and update the accumulator corresponding to the cell associated with the triplet (c1, c2, c3). Clearly, the complexity of the Hough transform is strongly dependent on the number of coordinates and coefficients in a given functional representation. Further generalizations of the Hough transform to detect curves with no simple analytic representations are possible.

We now return to the edge-linking problem. An approach based on the Hough transform is as follows:
1. Compute the gradient of an image and threshold it to obtain a binary image.
2. Specify subdivisions in the ρθ plane.
3. Examine the counts of the accumulator cells for high pixel concentrations.
4. Examine the relationship (principally for continuity) between pixels in a chosen cell.

The concept of continuity in this case usually is based on computing the distance between disconnected pixels identified during traversal of the set of pixels corresponding to a given accumulator cell. A gap at any point is

significant if the distance between that point and its closest neighbor exceeds a certain threshold.

9.3.3 Global Processing via Graph-Theoretic Techniques

In this section, a global approach based on representing edge segments in the form of a graph and searching the graph for low-cost paths that correspond to significant edges is discussed. This representation provides a rugged approach that performs well in the presence of noise. As might be expected, the procedure is considerably more complicated and requires more processing time.

A graph G = (N, A) is a finite, nonempty set of nodes N, together with a set A of unordered pairs of distinct elements of N. Each pair of A is called an arc. A graph in which the arcs are directed is called a directed graph. If an arc is directed from node ni to node nj, then nj is said to be a successor of its parent node ni. The process of identifying the successors of a node is called expansion of the node. In each graph we define levels, such that level 0 consists of a single node, called the start node, and the nodes in the last level are called goal nodes. A cost c(ni, nj) can be associated with every arc (ni, nj). A sequence of nodes n1, n2, ..., nk, with each node ni being a successor of node n(i-1), is called a path from n1 to nk, and the cost of the path is

C = Σ (i = 2 to k) c(n(i-1), ni)    (16)

Finally, an edge element is the boundary between two pixels p and q, such that p and q are 4-neighbors, as Fig. 9.11 illustrates. In this context, an edge is a sequence of edge elements.


Figure 9.11: Edge element between pixels p and q

We can illustrate how the foregoing concepts apply to edge detection with the 3 x 3 image shown in Fig. 9.11(a), where the outer numbers are pixel coordinates and the numbers in parentheses represent intensity. Each edge element defined by pixels p and q has an associated cost, defined as

c(p, q) = H - [f(p) - f(q)]    (17)

where H is the highest intensity value in the image, f(p) is the intensity value of p, and f(q) is the intensity value of q. As indicated earlier, p and q are 4-neighbors.
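A small sketch of equation (17); the sample intensities below are hypothetical stand-ins, since the actual values of Fig. 9.11(a) appear only in the figure.

```python
import numpy as np

# Hypothetical 3 x 3 intensities standing in for Fig. 9.11(a).
img = np.array([[7, 2, 2],
                [5, 7, 2],
                [5, 1, 0]])
H = img.max()                    # highest intensity in the image

def edge_cost(p, q):
    """c(p, q) = H - [f(p) - f(q)] for 4-neighbor pixels p and q."""
    return H - (img[p] - img[q])

# The edge element between p = (0, 0) and q = (0, 1) costs 7 - (7 - 2) = 2.
print(edge_cost((0, 0), (0, 1)))
```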

Figure 9.11 (a): A 3 x 3 image


Fig. 9.11(b) shows the graph for this problem; each node corresponds to an edge element, and an arc exists between two nodes if the two corresponding edge elements taken in succession can be part of an edge. The cost of each edge element, computed by using equation (17), is shown next to the arc leading into it, and goal nodes are shown as shaded rectangles. Each path between the start node and a goal node is a possible edge. For simplicity, the edge is assumed to start in the top row and terminate in the last row, so that the first element of an edge can be only [(0, 0), (0, 1)] or [(0, 1), (0, 2)], and the last element must be either [(2, 0), (2, 1)] or [(2, 1), (2, 2)]. The dashed lines represent the minimum-cost path, computed by using equation (16). Fig. 9.11(c) shows the corresponding edge.

Figure 9.11(b): Graph used for finding an edge in the image of Fig. 9.11(a). The pair (a, b)(c, d) in each box refers to points p and q, respectively.


In general, the problem of finding a minimum-cost path is not trivial in terms of computation. Typically, the approach is to sacrifice optimality for the sake of speed, and the following algorithm represents a class of procedures that use heuristics in order to reduce the search effort. Let r(n) be an estimate of the cost of a minimum-cost path from the start node s to a goal node, where the path is constrained to go through n. This cost can be expressed as the estimate of the cost of a minimum-cost path from s to n plus an estimate of the cost of the path from n to a goal node; that is,

r(n) = g(n) + h(n)    (18)

Figure 9.11(c): Edge corresponding to the minimum-cost path in Fig. 9.11(a)

Here, g(n) can be chosen as the lowest-cost path from s to n found so far, and h(n) is obtained by using any available heuristic information. An algorithm that uses r(n) as the basis for performing a graph search is as follows:
1. Mark the start node OPEN and set g(s) = 0.
2. If no node is OPEN, exit with failure; otherwise, continue.
3. Mark CLOSED the OPEN node n whose estimate r(n), computed from equation (18), is smallest.
4. If n is a goal node, exit with the solution path obtained by tracing back through the pointers; otherwise, continue.
5. Expand node n, generating all of its successors.
6. If a successor ni is not marked, set r(ni) = g(n) + c(n, ni), mark it OPEN, and direct pointers from it back to n.


7. If a successor ni is marked CLOSED or OPEN, update its value by letting
   g'(ni) = min[g(ni), g(n) + c(n, ni)]
   Mark OPEN those CLOSED successors whose g values were thus lowered, and redirect to n the pointers from all nodes whose g values were lowered. Go to step 2.

In general, this algorithm does not guarantee a minimum-cost path; its advantage is speed via the use of heuristics. However, if h(n) is a lower bound on the cost of the minimal-cost path from node n to a goal node, the procedure indeed yields an optimal path to a goal. If no heuristic information is available (i.e., h = 0), the procedure reduces to the uniform-cost algorithm of Dijkstra.
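A compact sketch of this heuristic search (essentially the A* algorithm when h is a lower bound); the graph encoding and the function names are our own assumptions.

```python
import heapq

def heuristic_search(successors, cost, h, start, is_goal):
    """Search with r(n) = g(n) + h(n); returns a start-to-goal path or None.

    successors(n) -> iterable of successor nodes
    cost(n, m)    -> arc cost c(n, m)
    h(n)          -> heuristic estimate from n to a goal
    """
    g = {start: 0.0}
    parent = {start: None}
    open_heap = [(h(start), start)]          # ordered by r(n) = g(n) + h(n)
    closed = set()
    while open_heap:
        r, n = heapq.heappop(open_heap)
        if n in closed:
            continue
        if is_goal(n):                       # trace pointers back to start
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
        closed.add(n)
        for m in successors(n):              # expand node n
            g_new = g[n] + cost(n, m)
            if g_new < g.get(m, float("inf")):
                g[m] = g_new                 # lower g: (re)open with new pointer
                parent[m] = n
                closed.discard(m)
                heapq.heappush(open_heap, (g_new + h(m), m))
    return None                              # no OPEN node: exit with failure
```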

9.4 Summary
Image segmentation is an essential preliminary step in most automatic pictorial pattern recognition and scene analysis problems. Segmentation subdivides an image into its constituent regions or objects. There are three basic types of discontinuities in a digital image: points, lines, and edges.

Self Assessment Questions
1. _________ subdivides an image into its constituent regions or objects.
2. Image segmentation algorithms generally are based on one of two basic properties of intensity values: ________ and ________.
3. _______, _______ and _______ are the three basic types of discontinuities.
4. _________ are designed with suitable coefficients and are applied at each point in an image.
5. The _________ process takes an unordered set of edge pixels produced by an edge detector as input and forms an ordered list of edges.

9.5 Terminal Questions


1. Explain the detection of discontinuities.
2. Explain the concepts of point and line detection.
3. Explain global processing via graph-theoretic techniques.
4. Explain local processing for edge linking.

9.6 Answers
Self Assessment Questions
1. Segmentation
2. Discontinuity, similarity
3. Points, lines, and edges
4. Pattern templates (masks)
5. Edge linking

Terminal Questions
1. Refer Section 9.2
2. Refer Sections 9.2.1 and 9.2.2
3. Refer Section 9.3.3
4. Refer Section 9.3.1
