
ADAPTATION K-MEANS ALGORITHM
FOR IMAGE SEGMENTATION

A REPORT

TO

DR. SUMANA GUPTA

DEPARTMENT OF ELECTRICAL ENGINEERING,

IIT, KANPUR



CONTENTS

1. Brief Introduction
2. Theory
   - General K-means algorithm
   - Previous Work
3. Approach
   - Adaptation K-means
4. Discussion
   - Simulation
   - Our contribution
5. Conclusion
6. References
 


ABSTRACT

An improved version of the K-means algorithm for image segmentation is presented in this report; it works by reducing the number of iterations required for the algorithm to converge. This is done by first estimating the number of clusters using a set of different frame sizes of the same image, and then modifying the convergence steps of the algorithm. The method is tested on different images, and the results show that the proposed method is efficient compared to standard K-means.
INTRODUCTION
Image segmentation is the process of separating out mutually
exclusive homogeneous regions of interest in an image.
The goal of segmentation is to simplify and/or change the
representation of an image into something that is more meaningful
and easier to analyze. Image segmentation is typically used to locate
objects and boundaries (lines, curves, etc.) in images.
More precisely, image segmentation is the process of assigning a
label to every pixel in an image such that pixels with the same label
share certain visual characteristics.
The result of image segmentation is a set of segments that
collectively cover the entire image, or a set of contours extracted
from the image (see edge detection).
Each of the pixels in a region is similar with respect to some
characteristic or computed property, such as color, intensity, or
texture. Adjacent regions are significantly different with respect to
the same characteristic(s).
For intensity images (i.e., those represented by point-wise intensity
levels) four popular approaches are: threshold techniques, edge-
based methods, region-based techniques, and connectivity-
preserving relaxation methods.
Threshold techniques, which make decisions based on local pixel
information, are effective when the intensity levels of the objects fall
squarely outside the range of levels in the background. Because
spatial information is ignored, however, blurred region boundaries
can create havoc.
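As a minimal illustration of this family of methods (not taken from the report; the function name and the toy image are our own), a global threshold simply compares each pixel against a fixed intensity level:

```python
import numpy as np

def threshold_segment(image, t):
    """Label each pixel by comparing its intensity to a global threshold t.

    Pixels brighter than t are labelled 1 (object), the rest 0 (background).
    Spatial information is ignored, which is exactly why blurred region
    boundaries are a problem for this family of methods.
    """
    return (np.asarray(image) > t).astype(np.uint8)

# A toy 4x4 "image": a bright object on a dark background.
img = np.array([[ 10,  12,  11,  13],
                [ 12, 200, 210,  11],
                [ 13, 205, 198,  12],
                [ 11,  13,  12,  10]])
mask = threshold_segment(img, 100)
```

Here the threshold 100 falls squarely between the background levels (around 10) and the object levels (around 200), so the mask recovers the object exactly.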
Edge-based methods centre around contour detection: their weakness in connecting together broken contour lines makes them, too, prone to failure in the presence of blurring.
Region-based segmentation usually proceeds as follows: the image is
partitioned into connected regions by grouping neighbouring pixels
of similar intensity levels. Adjacent regions are then merged under
some criterion involving perhaps homogeneity or sharpness of region
boundaries.
Over-stringent criteria create fragmentation; lenient ones overlook blurred boundaries and over-merge. Hybrid techniques using a mix of the methods above are also popular.
A connectivity-preserving relaxation-based segmentation method, usually referred to as the active contour model, was proposed recently.
The main idea is to start with some initial boundary shape
represented in the form of spline curves, and iteratively modify it by
applying various shrink/expansion operations according to some
energy function.
Although the energy-minimizing model is not new, coupling it with the maintenance of an "elastic" contour model gives it an interesting new twist. As usual with such methods, getting trapped in a local minimum is a risk against which one must guard; this is no easy task.



THEORY

General K-means algorithm
Some steps involved are:
1. Arbitrarily choose k initial cluster centres, e.g. by selecting k pixel values from the image.
2. For each pixel, compute its distance to every cluster centre and assign the pixel to the cluster whose centre is nearest.
3. Recompute each cluster centre as the mean of the pixels currently assigned to that cluster.
4. Repeat steps 2 and 3 until some termination criteria are met.
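The steps above can be sketched on 1-D pixel intensities as follows. This is a minimal illustration, not the report's implementation; the percentile-based initialisation is our own choice of "arbitrary" initial centres:

```python
import numpy as np

def kmeans_1d(pixels, k, max_iter=100):
    """Plain K-means on 1-D pixel intensities, following the steps above."""
    pixels = np.asarray(pixels, dtype=float)
    # Step 1: arbitrarily choose k initial cluster centres
    # (here: k evenly spaced percentiles of the intensity range).
    centres = np.percentile(pixels, np.linspace(0, 100, k))
    labels = np.full(len(pixels), -1)
    for _ in range(max_iter):
        # Step 2: assign every pixel to the cluster with the nearest centre.
        new_labels = np.argmin(np.abs(pixels[:, None] - centres[None, :]), axis=1)
        # Step 4: terminate once no assignment changes.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Step 3: recompute each centre as the mean of its cluster.
        for j in range(k):
            if np.any(labels == j):
                centres[j] = pixels[labels == j].mean()
    return labels, centres
```

On a strip of pixels with two clearly separated intensity groups, e.g. `kmeans_1d([10, 12, 11, 200, 205, 198], 2)`, the centres converge to the group means 11 and 201 in a couple of iterations.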

Previous Work

K-means is one of the simplest unsupervised learning algorithms that


solve the well known clustering problem. The procedure follows a
simple and easy way to classify a given data set through a certain
number of clusters (assume k clusters) fixed a priori.

The main idea is to define "k" centroids, one for each cluster, either randomly or using a training set from the image itself. The next step is to take each point belonging to the given data set and associate it to the nearest centroid. When no point is pending, the first step is completed and an early grouping is done. At this point we need to re-calculate "k" new centroids as the means of the clusters resulting from the previous step. After we have these "k" new centroids, a new binding has to be done between the same data set points and the nearest new centroid. A loop has been generated. As a result of this loop we may notice that the k centroids change their location step by step until no more changes are done; in other words, the centroids do not move any more.
Finally, this algorithm aims at minimizing an objective function, in this case a squared-error function:

J = Σ_{j=1}^{k} Σ_{i=1}^{n} || x_i^(j) − c_j ||²

where || x_i^(j) − c_j ||² is a chosen distance measure between a data point x_i^(j) and the cluster centre c_j, and J is an indicator of the distance of the n data points from their respective cluster centres.
The algorithm is composed of the following steps:
1. Place k points into the space represented by the objects that are being clustered. These points represent the initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the k centroids.
4. Repeat steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups from which the metric to be minimized can be calculated.
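The objective function above can be computed directly from a clustering. A small sketch (the function name and toy data are our own):

```python
import numpy as np

def squared_error(points, labels, centres):
    """J = sum over clusters j, and points i in cluster j, of ||x_i - c_j||^2."""
    points = np.asarray(points, dtype=float)
    centres = np.asarray(centres, dtype=float)
    labels = np.asarray(labels)
    # centres[labels] pairs every point with its own cluster centre.
    return float(np.sum((points - centres[labels]) ** 2))

# Two tight clusters around 1.5 and 9.5: each point is 0.5 away from
# its centre, so J = 4 * 0.5^2 = 1.0.
j = squared_error([1, 2, 9, 10], [0, 0, 1, 1], [1.5, 9.5])
```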


Although it can be proved that the procedure will always terminate,


the k-means algorithm does not necessarily find the optimal configuration corresponding to the global minimum of the objective function. The algorithm is also significantly sensitive to the initial, randomly selected cluster centres. The k-means algorithm can be run multiple times to reduce this effect.
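The multiple-runs idea mentioned above can be sketched as follows: run the algorithm several times from random initial centres and keep the run with the lowest squared error. This is not from the report; the restart count, seed, and the inlined 1-D K-means body are our own:

```python
import numpy as np

def best_of_restarts(pixels, k, runs=5, seed=0):
    """Run 1-D K-means 'runs' times from random initial centres and keep
    the run with the smallest squared error, mitigating the sensitivity
    to initialisation discussed above."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=float)
    best = None
    for _ in range(runs):
        # Random initial centres: k distinct pixel values.
        centres = rng.choice(pixels, size=k, replace=False)
        for _ in range(100):
            labels = np.argmin(np.abs(pixels[:, None] - centres[None, :]), axis=1)
            new = np.array([pixels[labels == j].mean() if np.any(labels == j)
                            else centres[j] for j in range(k)])
            if np.allclose(new, centres):
                break
            centres = new
        # Squared-error objective of this run.
        err = float(np.sum((pixels - centres[labels]) ** 2))
        if best is None or err < best[0]:
            best = (err, labels, centres)
    return best

err, labels, centres = best_of_restarts([0, 1, 2, 50, 51, 52], 2)
```

For well-separated data like this the runs all reach the same minimum, but on harder data the restarts guard against a poor random initialisation.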

APPROACH

Adaptation K-means

As mentioned earlier, modifications have been made at the initialization and convergence steps with respect to the standard K-means algorithm. The steps of the algorithm are:
1) Initialization is done by reading the data as a 2D matrix for the frame sizes F1 512×512, F2 256×256, F3 128×128, F4 64×64, F5 32×32, F6 16×16, F7 8×8, F8 4×4 and calculating the mean of every frame. The means are stored in ascending order in an array.
2) The repeated values are singularized. The number of elements in the resultant array is the number of clusters, while the values of the means are the centroids of the respective clusters, used to initialize the algorithm.
3) The distances of the pixel values from the centroid values obtained in the previous step are calculated, and clustering is then done on a minimum-distance basis.
4) Based on this clustering, the cluster centres (Ci) are formed by taking the mean of the pixel values in the respective cluster.
5) The mean of these new centres is then calculated and stored in a variable.
6) Based on the minimum-distance criterion, the pixels of the image are again clusterized.
7) New centres (Ci+1) are obtained, and the difference between Ci and Ci+1 is compared with the "mean" obtained in step 5).
8) If the difference comes out greater than the mean, the new centre is retained as Ci; otherwise the average of Ci and Ci+1 is taken. This is done for every cluster. If none of the centres shifts, then the clustering is done. In case any one is shifted, steps 3) to 8) are repeated.
The above steps are shown as a flow chart:
Based on the clusters obtained at the end of the algorithm, a classified mask image is constructed by assigning the centroid indexes to the pixel intensities based on the cluster they belong to. For example, if we have 3 clusters with respective centres denoted by C1, C2 and C3, then the pixels belonging to cluster number 1 are assigned an intensity value of 1, and a similar approach is adopted for the pixels belonging to the other two clusters.

The classified image thus obtained gives a segmented image of the


original image.
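Under some assumptions the report leaves open (frames are taken from the top-left corner of the image, frame means are rounded to whole intensity values before duplicates are removed, and "no centre shifts" is tested with a small tolerance), steps 1)-8) and the mask construction can be sketched as:

```python
import numpy as np

def adaptation_kmeans(image, max_iter=200):
    """Sketch of the adaptation K-means steps described above."""
    img = np.asarray(image, dtype=float)
    pixels = img.ravel()
    # Steps 1)-2): means of nested frames become the initial centroids,
    # with repeated values singularized (assumption: top-left frames,
    # means rounded to whole intensities before deduplication).
    sizes = [512, 256, 128, 64, 32, 16, 8, 4]
    means = [img[:s, :s].mean() for s in sizes if s <= min(img.shape)]
    centres = np.unique(np.round(np.sort(means)))
    k = len(centres)
    for _ in range(max_iter):
        # Steps 3)-4) and 6): minimum-distance clustering, new centres.
        labels = np.argmin(np.abs(pixels[:, None] - centres[None, :]), axis=1)
        new = np.array([pixels[labels == j].mean() if np.any(labels == j)
                        else centres[j] for j in range(k)])
        # Step 5): the "mean" threshold is the mean of the new centres.
        thr = new.mean()
        shift = np.abs(new - centres)
        # Steps 7)-8): retain a centre that moved more than thr,
        # otherwise average the old and new centres.
        centres = np.where(shift > thr, new, (new + centres) / 2.0)
        if np.all(shift < 1e-6):  # no centre shifted: clustering is done
            break
    # Mask construction: every pixel takes its (1-based) cluster index.
    mask = (labels + 1).reshape(img.shape)
    return mask, centres

# Toy 8x8 image: dark left half (10), bright right half (200).
img = np.full((8, 8), 10.0)
img[:, 4:] = 200.0
mask, centres = adaptation_kmeans(img)
```

On this toy image only the 8×8 and 4×4 frames fit, giving the frame means 105 and 10 and hence k = 2 initial centroids; the mask then labels the dark half 1 and the bright half 2.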
DISCUSSION

Simulation


The algorithm was simulated for five standard images [3]. The comparison is done among standard K-means [SKM], adaptation K-means [AKM], and the variance and median versions of the proposed AKM, for the number of iterations of the respective algorithms. The same is shown as a bar graph:
[Bar graph: number of iterations of SKM, AKM, VARIANCE_AKM and MEDIAN_AKM on the TRUCK, MOON, FISH, MULTIFISH and BABY test images.]

Our contribution
- In addition to the proposed mean criterion, we also tried the proposed method with median and variance criteria. This is done at the decision step 7) of the algorithm: instead of comparing the distance between Ci and Ci+1 with the mean, we compared it with the median and the variance of the centres.
- As is visible in the graph, the number of iterations in adaptation K-means, as proposed in the paper, is significantly reduced compared with standard K-means. The same can be said for the median- and variance-based variations of the AKM as simulated by us, but their results are never better than those of the mean-based version.


CONCLUSION
The method proposed in the paper is promising for a large collection of images. It shows significant improvement over SKM as far as the number of iterations required by the algorithm is concerned (which in turn determines the speed of the algorithm, very important for real-time applications). The median and variance variations of the proposed AKM are also better than SKM, but there is little difference in their number of iterations compared to the proposed AKM.

REFERENCES

- R. C. Gonzalez, R. E. Woods and S. L. Eddins, Digital Image Processing Using MATLAB.
- B. Jeon, Y. Yung and K. Hong, "Image segmentation by unsupervised sparse clustering," Pattern Recognition Letters 27, Science Direct (2006), 1650-1664.
- http://sipi.usc.edu/database/ (database of standard image processing test images)
