Principal Components Analysis (PCA)
Applications
• Uses:
– Data Visualization
– Data Reduction
– Data Classification
– Trend Analysis
– Noise Reduction
• Examples:
– How many unique “sub-sets” are in the sample?
– How are they similar / different?
– What are the underlying factors that influence the samples?
– Which time / temporal trends are (anti)correlated?
– Which measurements are needed to differentiate?
– How to best present what is “interesting”?
Cluster Analysis
How does Principal Components
Analysis Differ from Cluster Analysis?
How does PCA Work?
• PCA transforms a set of “k” correlated variables into a new set of
uncorrelated variables called principal components.
• The first principal component (Prin 1) accounts for the most variation
in the data set.
• Prin 2 is not correlated with the first principal component and accounts
for the second-most variation in the data set, and so on.
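The transformation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the JMP procedure: the helper name `principal_components` and the simulated three-variable data set are assumptions made for the example, and the analysis is run on the correlation matrix.

```python
import numpy as np

def principal_components(X):
    """PCA on the correlation matrix of X (rows = samples, columns = variables).

    Returns eigenvalues (largest first), the matching eigenvectors, and the
    principal component scores. Hypothetical helper for illustration only.
    """
    # Standardize each variable to mean 0 and SD 1
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # re-sort: largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                       # Prin 1, Prin 2, ... as columns
    return eigvals, eigvecs, scores

# Simulated data (an assumption): k = 3 correlated variables sharing one factor
rng = np.random.default_rng(0)
factor = rng.normal(size=200)
X = np.column_stack([
    factor + 0.3 * rng.normal(size=200),
    2.0 * factor + 0.5 * rng.normal(size=200),
    -factor + 0.8 * rng.normal(size=200),
])

eigvals, eigvecs, scores = principal_components(X)
score_corr = np.corrcoef(scores, rowvar=False)  # off-diagonals should be ~0
print("Variance of each component:", np.round(eigvals, 3))
print("Correlation between components:\n", np.round(score_corr, 6))
```

The original variables are strongly correlated, while the component scores have essentially zero correlation with one another, and their variances appear in decreasing order.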
Steps in PCA
Principal Components Analysis on:
• Covariance Matrix:
– Variables must be in same units
– Emphasizes variables with most variance
– Mean eigenvalue ≠ 1.0 in general (it equals the mean variance of the variables)
• Correlation Matrix:
– Variables are standardized (mean 0.0, SD 1.0)
– Variables can be in different units
– All variables have same impact on analysis
– Mean eigenvalue = 1.0
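The contrast between the two matrices can be checked numerically. The sketch below uses simulated measurements in deliberately mismatched units (the variable names and values are assumptions for illustration) to show why the covariance matrix favors high-variance variables and why the mean eigenvalue of the correlation matrix is 1.0.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated variables in different units (hypothetical data)
height_cm = rng.normal(170.0, 10.0, size=n)
weight_kg = 0.5 * height_cm + rng.normal(0.0, 5.0, size=n)
ratio = rng.normal(1.0, 0.01, size=n)          # unitless, tiny variance
X = np.column_stack([height_cm, weight_kg, ratio])

# Eigenvalues of the covariance and correlation matrices
cov_eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))
corr_eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))

mean_cov = cov_eigvals.mean()                  # equals the mean variance
mean_corr = corr_eigvals.mean()                # equals 1.0
mean_variance = X.var(axis=0, ddof=1).mean()

print("Covariance eigenvalues: ", np.round(cov_eigvals, 4))
print("Correlation eigenvalues:", np.round(corr_eigvals, 4))
print("Mean eigenvalue (covariance): ", round(mean_cov, 4))
print("Mean eigenvalue (correlation):", round(mean_corr, 4))
```

The eigenvalues of a symmetric matrix sum to its trace. The trace of a correlation matrix is the number of variables, so the mean eigenvalue is exactly 1.0; the trace of a covariance matrix is the sum of the variances, so a variable measured on a large scale dominates the leading components.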
Principal Components Analysis in JMP
Selecting the Best Number of Principal Components – Method 1
(using princomp.jmp from Klimberg & McCullough, pg. 143)
Selecting the Best Number of Principal Components – Methods 2 & 3
Another Example – MassHousing.jmp
Principal Components Analysis – Mass Housing Data
The Scree Plot shows that after the first Principal Component, the explanation of variability drops off quite a bit.
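The numbers behind a scree plot are the eigenvalues in decreasing order, usually shown with the percent of variability each one explains. The sketch below builds that table for simulated data, not for MassHousing.jmp; the helper name `scree_table` and the data are assumptions for illustration.

```python
import numpy as np

def scree_table(X):
    """Eigenvalues of the correlation matrix (largest first), the percent of
    variability each explains, and the cumulative percent.
    Hypothetical helper for illustration only."""
    R = np.corrcoef(X, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
    percent = 100.0 * eigvals / eigvals.sum()
    cumulative = np.cumsum(percent)
    return eigvals, percent, cumulative

# Simulated data (an assumption): five variables driven by one shared factor
rng = np.random.default_rng(2)
shared = rng.normal(size=(300, 1))
X = shared + 0.5 * rng.normal(size=(300, 5))

eigvals, percent, cumulative = scree_table(X)
for i, (ev, p, c) in enumerate(zip(eigvals, percent, cumulative), start=1):
    print(f"Prin {i}: eigenvalue={ev:.3f}  percent={p:.1f}  cumulative={c:.1f}")
```

In data like this, where one factor drives every variable, the first eigenvalue is large and the rest are small, which gives the same kind of sharp drop after the first component that the scree plot shows.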
Principal Components Analysis – Mass Housing Data (continued)
Loading Matrix from
Mass Housing PCA Analysis
Loading Plot for Components 1 & 2
Comparing Multiple Regression Results – Principal
Components vs. Original Independent Variables
Summary and key points