
ECE 471/571 – Lecture 7

Dimensionality Reduction – Principal Component Analysis
Different Approaches - More Detail
Pattern Classification
- Statistical Approach
  - Supervised. Basic concepts: Bayesian decision rule (MPP, LR, Discri.); parametric learning (ML, BL); non-parametric learning (kNN); NN (Perceptron, BP)
  - Unsupervised. Basic concepts: distance; agglomerative method; k-means; winner-take-all; Kohonen maps
- Syntactic Approach

Dimensionality Reduction: Fisher's linear discriminant, K-L transform (PCA)
Performance Evaluation: ROC curve; TP, TN, FN, FP
Stochastic Methods: local optimization (GD), global optimization (SA, GA)
Principal Component Analysis or K-L
Transform

How to find a new feature space (m-dimensional) that is adequate to describe the original feature space (d-dimensional). Suppose m < d.

[Figure: data in the original axes x1, x2 with the new principal axes y1, y2 overlaid]
K-L Transform (1)
Describe vector x in terms of a set of basis vectors b_i:

x = \sum_{i=1}^{d} y_i b_i, \qquad y_i = b_i^T x

The basis vectors b_i should be linearly independent and orthonormal, that is,

b_i^T b_j = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}
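As a quick numerical sanity check (a sketch added here, not part of the original slides, with illustrative variable names), the expansion and the orthonormality condition can be verified with NumPy:

```python
import numpy as np

# Sketch: verify x = sum_i y_i * b_i with y_i = b_i^T x for an orthonormal basis.
rng = np.random.default_rng(0)
d = 4
x = rng.normal(size=d)

# Columns of B form an orthonormal basis (QR of a random matrix).
B, _ = np.linalg.qr(rng.normal(size=(d, d)))

y = B.T @ x                              # coefficients y_i = b_i^T x
x_rec = B @ y                            # reconstruction sum_i y_i * b_i
assert np.allclose(x, x_rec)             # exact when all d terms are kept
assert np.allclose(B.T @ B, np.eye(d))   # orthonormality: b_i^T b_j = delta_ij
```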
K-L Transform (2)
Suppose we wish to ignore all but m (m < d) components of y and still represent x, although with some error. We will thus calculate the first m elements of y and replace the others with constants:

x = \sum_{i=1}^{m} y_i b_i + \sum_{i=m+1}^{d} y_i b_i \approx \sum_{i=1}^{m} y_i b_i + \sum_{i=m+1}^{d} \alpha_i b_i

Error:

\Delta x = \sum_{i=m+1}^{d} (y_i - \alpha_i) b_i
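A small sketch of this truncation (the data and names are illustrative, not from the lecture): keep the first m coefficients, replace the rest with constants alpha_i, and inspect the residual Delta x.

```python
import numpy as np

# Sketch: truncate the expansion after m terms, replacing the dropped
# coefficients with constants alpha_i (here, their means over the data).
rng = np.random.default_rng(1)
d, m, n = 5, 2, 1000

X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))   # correlated samples
B, _ = np.linalg.qr(rng.normal(size=(d, d)))            # some orthonormal basis

Y = X @ B                         # coefficients, one row per sample
alpha = Y.mean(axis=0)            # constants for the dropped components
Y_trunc = Y.copy()
Y_trunc[:, m:] = alpha[m:]

X_approx = Y_trunc @ B.T          # approximate reconstruction
delta = X - X_approx              # Delta x for each sample
print("mean squared error:", np.mean(np.sum(delta**2, axis=1)))
```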
K-L Transform (3)
Use the mean-square error to quantify the error:

\epsilon^2(m) = E\left\{ \left[ \sum_{i=m+1}^{d} (y_i - \alpha_i) b_i \right]^T \left[ \sum_{j=m+1}^{d} (y_j - \alpha_j) b_j \right] \right\}
= E\left\{ \sum_{i=m+1}^{d} \sum_{j=m+1}^{d} (y_i - \alpha_i)(y_j - \alpha_j) b_i^T b_j \right\}
= \sum_{i=m+1}^{d} E\left\{ (y_i - \alpha_i)^2 \right\}

since, by orthonormality, only the i = j terms survive.
K-L Transform (4)
Find the optimal \alpha_i to minimize \epsilon^2:

\frac{\partial \epsilon^2}{\partial \alpha_i} = -2\left( E\{y_i\} - \alpha_i \right) = 0 \quad\Rightarrow\quad \alpha_i = E\{y_i\}

Therefore, the error is now equal to

\epsilon^2(m) = \sum_{i=m+1}^{d} E\left\{ \left( y_i - E\{y_i\} \right)^2 \right\}
= \sum_{i=m+1}^{d} E\left\{ \left( b_i^T x - E\{b_i^T x\} \right)^2 \right\}
= \sum_{i=m+1}^{d} E\left\{ \left( b_i^T x - E\{b_i^T x\} \right) \left( x^T b_i - E\{x^T b_i\} \right) \right\}
= \sum_{i=m+1}^{d} b_i^T E\left\{ (x - E\{x\})(x - E\{x\})^T \right\} b_i
= \sum_{i=m+1}^{d} b_i^T \Sigma_x b_i
= \sum_{i=m+1}^{d} \lambda_i

where the last equality holds when the b_i are chosen as eigenvectors of the covariance matrix \Sigma_x, with \lambda_i the corresponding eigenvalues.
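In other words, the mean-square error of keeping only m components is the sum of the eigenvalues of \Sigma_x attached to the discarded components. A numerical check of this result (my own sketch; nothing here comes from the slides beyond the math above):

```python
import numpy as np

# Sketch: with the eigenvectors of Sigma_x as the basis, the MSE from keeping
# only m components equals the sum of the d - m smallest eigenvalues.
rng = np.random.default_rng(2)
d, m, n = 6, 3, 200_000

X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))   # correlated data
mu = X.mean(axis=0)
Sigma = np.cov(X, rowvar=False)

eigvals, eigvecs = np.linalg.eigh(Sigma)                 # ascending order
order = np.argsort(eigvals)[::-1]                        # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

Y = (X - mu) @ eigvecs            # K-L coefficients
Y[:, m:] = Y[:, m:].mean(axis=0)  # alpha_i = E{y_i} (zero after centering)
X_approx = Y @ eigvecs.T + mu

mse = np.mean(np.sum((X - X_approx) ** 2, axis=1))
print(mse, eigvals[m:].sum())     # the two numbers should nearly agree
```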
K-L Transform (5)
The optimal choice of basis vectors is the eigenvectors of \Sigma_x, the covariance matrix of x.
The expansion of a random vector in terms of the
eigenvectors of the covariance matrix is referred to
as the Karhunen-Loeve expansion, or the “K-L
expansion”
Without loss of generality, we will sort the eigenvectors b_i in terms of their eigenvalues, that is, \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d. Then we refer to b_1, corresponding to \lambda_1, as the "major eigenvector", or "principal component".
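A common way to pick m in practice (standard usage, not stated on this slide) is to keep enough of the sorted eigenvalues to explain a chosen fraction of the total variance:

```python
import numpy as np

# Sketch: choose m so the kept eigenvalues explain at least 95% of the variance.
eigvals = np.array([5.2, 2.1, 0.9, 0.4, 0.25, 0.15])    # sorted, lambda_1 >= ... >= lambda_d
explained = np.cumsum(eigvals) / eigvals.sum()
m = int(np.searchsorted(explained, 0.95) + 1)
print(m, explained)
```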

Summary
Raw data → covariance matrix → eigenvalues → eigenvectors → principal components
How to use the error rate?
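A compact end-to-end sketch of that pipeline (function and variable names are mine, offered as an illustration rather than the course's reference implementation):

```python
import numpy as np

def pca_kl(X, m):
    """Raw data -> covariance matrix -> eigenvalues -> eigenvectors -> principal components."""
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)                 # covariance matrix of the raw data
    eigvals, eigvecs = np.linalg.eigh(Sigma)        # eigen-decomposition (ascending)
    order = np.argsort(eigvals)[::-1]               # sort by decreasing eigenvalue
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Y = (X - mu) @ eigvecs[:, :m]                   # project onto the first m components
    return Y, eigvals, eigvecs

# Usage: reduce 10-dimensional data to its 2 principal components.
X = np.random.default_rng(3).normal(size=(500, 10))
Y, eigvals, eigvecs = pca_kl(X, m=2)
print(Y.shape, eigvals[:2])
```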
