Alexander Wong
Outline
3 Prototype Selection
Definition:
g_k(x) = -z_k^T x + (1/2) z_k^T z_k    (4)
Therefore, MED classification is made based on the minimum discriminant for a given x:
g(x) = (z_k - z_l)^T x + (1/2)(z_l^T z_l - z_k^T z_k) = 0    (7)
g(x) = w^T x + w_0 = 0    (8)
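As a minimal sketch (assuming NumPy; the function names are illustrative, not from the notes), MED classification evaluates the discriminant in (4) for every prototype and picks the class with the minimum value:

```python
import numpy as np

def med_discriminant(z, x):
    """g_k(x) = -z_k^T x + (1/2) z_k^T z_k for prototype z_k, eq. (4)."""
    return -z @ x + 0.5 * (z @ z)

def med_classify(prototypes, x):
    """Assign x to the class whose discriminant g_k(x) is minimum."""
    gs = [med_discriminant(z, x) for z in prototypes]
    return int(np.argmin(gs))

# Two prototypes; a point near z1 gets class 0, a point near z2 gets class 1.
z1, z2 = np.array([3.0, 3.0]), np.array([4.0, 5.0])
print(med_classify([z1, z2], np.array([2.0, 2.0])))    # → 0
print(med_classify([z1, z2], np.array([10.0, 10.0])))  # → 1
```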
Prototype Selection
Definition (sample mean):
z_k = (1/N_k) * sum_{i=1}^{N_k} x_i    (9)
Advantages:
+ Less sensitive to noise and outliers
Disadvantages:
- Poor at handling long, thin, tendril-like clusters
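As a quick sketch of eq. (9) (assuming NumPy), using the four class-2 samples that appear in the worked example later in this section:

```python
import numpy as np

# Class-2 samples from the worked example later in this section.
cluster = np.array([[3.0, 3.0], [4.0, 4.0], [3.0, 9.0], [6.0, 4.0]])

# Sample-mean prototype, eq. (9): z_k = (1/N_k) * sum_i x_i
z_k = cluster.mean(axis=0)
print(z_k)  # → [4. 5.]
```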
Definition (nearest neighbor): the prototype for each cluster is the labeled sample in that cluster closest to x, so the distance from x to a cluster is the distance to its nearest sample.
Advantages:
+ Better at handling long, thin, tendril-like clusters
Disadvantages:
- More sensitive to noise and outliers
- Computationally complex (need to re-compute all
prototypes for each new point)
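A minimal sketch of the nearest-neighbor prototype (assuming NumPy and Euclidean distance; the function name and sample values are illustrative):

```python
import numpy as np

def nn_prototype(cluster, x):
    """Nearest-neighbor prototype: the cluster sample closest to x."""
    d = np.linalg.norm(cluster - x, axis=1)  # distance from x to every sample
    return cluster[np.argmin(d)]

# The prototype must be re-computed for every new query point x.
cluster = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
print(nn_prototype(cluster, np.array([9.0, 9.0])))  # → [10. 10.]
```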
Definition (furthest point): the distance from x to each cluster is measured to that cluster's furthest sample, i.e., the prototype is the sample in the cluster farthest from x.
Advantages:
+ More tight, compact clusters
Disadvantages:
- More sensitive to noise and outliers
- Computationally complex (need to re-compute all
prototypes for each new point)
Idea:
Nearest neighbor is sensitive to noise, but handles long, tendril-like clusters well.
Sample mean is less sensitive to noise, but handles long, tendril-like clusters poorly.
What if we combine the two ideas?
Definition (K-nearest neighbor): For a given x you wish to classify, compute the distance between x and all labeled samples, and define the prototype of each cluster as the sample mean of the K samples within that cluster that are nearest to x.
Advantages:
+ Less sensitive to noise and outliers
+ Better at handling long, thin, tendril-like clusters
Disadvantages:
- Computationally complex (need to re-compute all
prototypes for each new point)
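The definition above can be sketched directly (assuming NumPy and Euclidean distance; the function name and toy samples are illustrative):

```python
import numpy as np

def knn_prototype(cluster, x, k):
    """K-nearest-neighbor prototype: mean of the k cluster samples nearest to x."""
    d = np.linalg.norm(cluster - x, axis=1)  # distance from x to each sample
    nearest = cluster[np.argsort(d)[:k]]     # the k closest samples
    return nearest.mean(axis=0)              # their sample mean, per eq. (9)

# Toy cluster: the prototype adapts to the query point, unlike the global mean.
cluster = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
print(knn_prototype(cluster, np.array([0.0, 0.0]), k=2))  # → [0.5 0. ]
```

Because the prototypes depend on x, they must be re-computed for every query, which is the computational cost noted above.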
MED decision boundary (using sample-mean prototypes):
z_1 = (1/4)[12 12]^T
z_1 = [3 3]^T.    (12)

z_2 = (1/N_2) * sum_{i=1}^{N_2} x_i
z_2 = (1/4)([3 3]^T + [4 4]^T + [3 9]^T + [6 4]^T)
z_2 = (1/4)[16 20]^T
z_2 = [4 5]^T.    (13)
MED assigns x to class 1 when g_1(x) < g_2(x):

-z_1^T x + (1/2) z_1^T z_1 < -z_2^T x + (1/2) z_2^T z_2    (18)

-([3 3]^T)^T [x_1 x_2]^T + (1/2)([3 3]^T)^T [3 3]^T
< -([4 5]^T)^T [x_1 x_2]^T + (1/2)([4 5]^T)^T [4 5]^T    (19)

-[3 3][x_1 x_2]^T + (1/2)[3 3][3 3]^T < -[4 5][x_1 x_2]^T + (1/2)[4 5][4 5]^T    (20)

-3x_1 - 3x_2 + 9 < -4x_1 - 5x_2 + 41/2    (21)
Therefore, the discriminant functions are:
g_1(x_1, x_2) = -3x_1 - 3x_2 + 9    (22)
g_2(x_1, x_2) = -4x_1 - 5x_2 + 41/2    (23)
g(x) = (z_k - z_l)^T x + (1/2)(z_l^T z_l - z_k^T z_k) = 0    (24)
g(x_1, x_2) = g_1(x_1, x_2) - g_2(x_1, x_2) = 0.    (25)
Plugging in the discriminant functions g_1 and g_2 gives us:
g(x_1, x_2) = -3x_1 - 3x_2 + 9 - (-4x_1 - 5x_2 + 41/2) = 0    (26)
Grouping terms:
-3x_1 - 3x_2 + 9 + 4x_1 + 5x_2 - 41/2 = 0    (27)
x_1 + 2x_2 - 23/2 = 0    (28)
x_2 = -(1/2) x_1 + 23/4    (29)
Therefore, the decision boundary is just a straight line with a slope of -1/2 and an offset of 23/4.
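The algebra can be sanity-checked numerically from the general boundary formula (24); a short sketch assuming NumPy, with k = 1 and l = 2:

```python
import numpy as np

z1, z2 = np.array([3.0, 3.0]), np.array([4.0, 5.0])

# Eq. (24): g(x) = (z1 - z2)^T x + (1/2)(z2^T z2 - z1^T z1) = 0
w = z1 - z2                     # = [-1, -2]
w0 = 0.5 * (z2 @ z2 - z1 @ z1)  # = 23/2

# Solve w^T x + w0 = 0 for x2 as a line in x1:
slope = -w[0] / w[1]            # expect -1/2
intercept = -w0 / w[1]          # expect 23/4
print(slope, intercept)  # → -0.5 5.75
```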