
K-Means Clustering

CMPUT 615
Applications of Machine Learning
in Image Analysis

K-means Overview
A clustering algorithm
An approximation to an NP-hard combinatorial optimization problem
It is unsupervised
K stands for the number of clusters; it is a user input to the algorithm
Given a set of data points or observations (all numerical), K-means attempts to classify them into K clusters
The algorithm is iterative in nature

K-means Details

X1, …, XN are data points or vectors or observations

Each observation will be assigned to one and only one cluster

C(i) denotes cluster number for the ith observation

Dissimilarity measure: Euclidean distance metric

K-means minimizes within-cluster point scatter:


W(C) = \frac{1}{2} \sum_{k=1}^{K} \sum_{C(i)=k} \sum_{C(j)=k} \| x_i - x_j \|^2 = \sum_{k=1}^{K} N_k \sum_{C(i)=k} \| x_i - m_k \|^2

where
m_k is the mean vector of the kth cluster
N_k is the number of observations in the kth cluster
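As a concrete check, W(C) can be computed directly from the second form above. A minimal Matlab sketch (the names X, C, K, and scatterWC are illustrative, not from the slides):

function W = scatterWC(X, C, K)
% Within-cluster point scatter W(C) for data X (N-by-p) and labels C (N-by-1)
W = 0;
for k = 1:K
    Xk = X(C == k, :);                    % observations assigned to cluster k
    mk = mean(Xk, 1);                     % cluster mean m_k
    Nk = size(Xk, 1);                     % cluster size N_k
    W = W + Nk * sum(sum((Xk - mk).^2));  % N_k * sum_i ||x_i - m_k||^2
end
end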

K-means Algorithm
For a given assignment C, compute the cluster means m_k:

m_k = \frac{\sum_{i:\, C(i)=k} x_i}{N_k}, \qquad k = 1, \ldots, K

For the current set of cluster means, assign each observation as:

C(i) = \arg\min_{1 \le k \le K} \| x_i - m_k \|^2, \qquad i = 1, \ldots, N

Iterate the above two steps until convergence
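A minimal Matlab sketch of this two-step iteration (variable names X, K, m, C are illustrative; the built-in kmeans used below does the same job more robustly):

% Minimal K-means iteration for data X (N-by-p) and a user-chosen K.
% Empty clusters and restarts are not handled in this sketch.
N = size(X, 1);
m = X(randperm(N, K), :);             % initialize means with K random observations
C = zeros(N, 1);
for iter = 1:100
    Cold = C;
    for i = 1:N                       % assignment step: nearest current mean
        [~, C(i)] = min(sum((m - X(i, :)).^2, 2));
    end
    if isequal(C, Cold), break; end   % converged: assignments unchanged
    for k = 1:K                       % update step: recompute cluster means
        m(k, :) = mean(X(C == k, :), 1);
    end
end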

Image Segmentation Results

[Figure: an image I, and the three-cluster image J computed on the gray values of I]

Matlab code:
I = double(imread('image.png'));  % 'image.png' is a placeholder; the filename was elided in the original
J = reshape(kmeans(I(:), 3), size(I));

Note that the K-means result is noisy

Summary
K-means converges, but it finds only a local minimum of the cost function
Works only for numerical observations (for categorical and mixed observations, K-medoids is a suitable clustering method)
Fine-tuning is required when applied to image segmentation, mostly because no spatial coherency is imposed in the K-means algorithm
Often serves as a starting point for more sophisticated image segmentation algorithms

Otsu's Thresholding Method (1979)

Based on the clustering idea: find the threshold that minimizes the weighted within-cluster point scatter.
This turns out to be the same as maximizing the between-class scatter.
Operates directly on the gray level histogram [e.g. 256 numbers, P(i)], so it's fast (once the histogram is computed).
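For example, the histogram can be computed once up front; a minimal Matlab sketch (assumes I is a grayscale image, uint8 or double scaled to [0, 1]; imhist is from the Image Processing Toolbox):

P = imhist(I);    % 256-bin gray level histogram
P = P / sum(P);   % normalize so the P(i) sum to 1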

Otsu's Method
Assumes the histogram (and the image) is bimodal.
No use of spatial coherence, nor any other
notion of object structure.
Assumes uniform illumination (implicitly), so
the bimodal brightness behavior arises from
object appearance differences only.

The weighted within-class variance is:

\sigma_w^2(t) = q_1(t)\,\sigma_1^2(t) + q_2(t)\,\sigma_2^2(t)

where the class probabilities are estimated as:

q_1(t) = \sum_{i=1}^{t} P(i), \qquad q_2(t) = \sum_{i=t+1}^{256} P(i)

and the class means are given by:

\mu_1(t) = \sum_{i=1}^{t} \frac{i\, P(i)}{q_1(t)}, \qquad \mu_2(t) = \sum_{i=t+1}^{256} \frac{i\, P(i)}{q_2(t)}

Finally, the individual class variances are:

\sigma_1^2(t) = \sum_{i=1}^{t} [i - \mu_1(t)]^2 \, \frac{P(i)}{q_1(t)}, \qquad \sigma_2^2(t) = \sum_{i=t+1}^{256} [i - \mu_2(t)]^2 \, \frac{P(i)}{q_2(t)}

Now, we could actually stop here. All we need to do is just run through the full range of t values [1, 256] and pick the value that minimizes \sigma_w^2(t).
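A minimal Matlab brute-force sketch of this search, written directly from the formulas above (P is the normalized 256-bin histogram; tbest is an illustrative name for the chosen threshold):

i = (1:256)';                             % gray levels
best = Inf;
for t = 1:255                             % t = 256 would leave class 2 empty
    q1 = sum(P(1:t));  q2 = 1 - q1;
    if q1 == 0 || q2 == 0, continue; end  % skip thresholds with an empty class
    mu1 = sum(i(1:t) .* P(1:t)) / q1;
    mu2 = sum(i(t+1:256) .* P(t+1:256)) / q2;
    s1 = sum((i(1:t) - mu1).^2 .* P(1:t)) / q1;
    s2 = sum((i(t+1:256) - mu2).^2 .* P(t+1:256)) / q2;
    sw = q1 * s1 + q2 * s2;               % weighted within-class variance
    if sw < best, best = sw; tbest = t; end
end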
But the relationship between the within-class and between-class variances can be exploited to generate a recursion relation that permits a much faster calculation.

Finally...
Initialization...

q_1(1) = P(1); \qquad \mu_1(0) = 0

Recursion...

q_1(t+1) = q_1(t) + P(t+1)

\mu_1(t+1) = \frac{q_1(t)\,\mu_1(t) + (t+1)\,P(t+1)}{q_1(t+1)}

\mu_2(t+1) = \frac{\mu - q_1(t+1)\,\mu_1(t+1)}{1 - q_1(t+1)}

where \mu is the total mean of the histogram.

After some algebra, we can express the total variance as:

\sigma^2 = \underbrace{\sigma_w^2(t)}_{\text{within-class, from before}} + \underbrace{q_1(t)\,[1 - q_1(t)]\,[\mu_1(t) - \mu_2(t)]^2}_{\text{between-class, } \sigma_B^2(t)}

Since the total is constant and independent of t, the effect of changing the threshold is merely to move the contributions of the two terms back and forth.
So, minimizing the within-class variance is the same as maximizing the between-class variance.
The nice thing about this is that we can compute the quantities in \sigma_B^2(t) recursively as we run through the range of t values.
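A minimal Matlab sketch of the fast version: a single pass over t that updates q1 and mu1 with the recursion above and maximizes sigma_B^2(t) (P is the normalized histogram as before; mu is the total mean):

mu = sum((1:256)' .* P);                          % total mean, independent of t
q1 = 0; mu1 = 0; best = -Inf;
for t = 1:255
    q1new = q1 + P(t);                            % q1(t) = q1(t-1) + P(t)
    if q1new > 0 && q1new < 1
        mu1 = (q1 * mu1 + t * P(t)) / q1new;      % recursive class-1 mean
        mu2 = (mu - q1new * mu1) / (1 - q1new);   % class-2 mean from the total
        sB = q1new * (1 - q1new) * (mu1 - mu2)^2; % between-class variance
        if sB > best, best = sB; tbest = t; end
    end
    q1 = q1new;
end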

Result of Otsu's Algorithm

[Figure: an image, the binary image produced by Otsu's method, and the gray level histogram of the image]

Matlab code:
I = double(imread('image.png'));  % 'image.png' is a placeholder; the filename was elided in the original
I = (I - min(I(:))) / (max(I(:)) - min(I(:)));  % normalize gray values to [0, 1]
J = I > graythresh(I);  % graythresh returns Otsu's threshold in [0, 1]
