
CEE 615: Digital Image Processing

Lab 9: Principal Components Analysis

Task A: Compute and interpret the statistics (covariance matrix, correlation matrix
and the resulting Principal Components) for a multiband image.
Load the SPOT_XI image from the ENVI directory.
Compute the statistics for the full image.
Select Basic Tools > Statistics > Compute Statistics
Select SPOT_XI, full scene, all bands.
Select OK
In the "Compute Statistics Parameters" window, make the
selections illustrated in the figure. (Some options won't
become active until other options have been selected.)
Enter a name for the output statistics file. I suggest
SPOT_XI-PCA.sta
Select OK.
Examine the Statistics Report (next page)
Note that Bands 1, 3 & 4 have a good dynamic range (max > 200) but the dynamic range
for band 2 (red) is relatively low (max = 158).
Figure 1: Statistics Parameters Window

The variance (the diagonal elements of the covariance matrix) is relatively low in the visible
(band 1: green; band 2: red), increases sharply in band 3 (NIR) and then decreases in band 4
(SWIR).
Covariance (the off-diagonal elements of the covariance matrix) is relatively high for all band
combinations. You should have computed a covariance "image", which gives a visual display of
the covariance matrix (Figure 2a). This is often easier to interpret than the covariance matrix
itself. You may also display the correlation "image" to visualize the correlation matrix.
Figure 2: a) Covariance; b) Correlation
From the correlation matrix (Figure 2b), it is apparent that the visible bands are highly correlated
(corr. = 0.965), and are both moderately correlated with the SWIR channel (corr. = 0.684,
0.763). There is poor correlation between Band 3 and all other bands. This suggests that the
information in Band 3 is unique (or very noisy).
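For readers who want to see how the covariance and correlation matrices relate, here is a minimal NumPy sketch. It does not read the SPOT_XI file; the three-band array is synthetic, and the function name band_statistics is a hypothetical helper, not an ENVI routine. Two bands share a common signal and the third is independent with a large variance, loosely mimicking the pattern described above.

```python
import numpy as np

def band_statistics(image):
    """Return (mean, covariance, correlation) for an array shaped (rows, cols, bands)."""
    # Flatten the spatial dimensions so each row is one pixel and each column one band.
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    mean = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)      # diagonal = band variances
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)         # normalize covariance by the band std devs
    return mean, cov, corr

# Synthetic stand-in data (an assumption, not the SPOT scene).
rng = np.random.default_rng(0)
base = rng.normal(100.0, 10.0, size=(50, 50))
image = np.stack([
    base + rng.normal(0.0, 1.0, base.shape),        # "band 1"
    0.8 * base + rng.normal(0.0, 1.0, base.shape),  # "band 2", correlated with band 1
    rng.normal(120.0, 30.0, base.shape),            # "band 3", independent, high variance
], axis=-1)

mean, cov, corr = band_statistics(image)
print(np.round(cov, 1))
print(np.round(corr, 3))
```

The correlation matrix is the covariance matrix rescaled so that every diagonal element is 1, which is why it is easier to compare band pairs whose variances differ greatly.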
Consider the eigenvector table. The eigenvectors are in the rows with the weightings for the
individual bands in the columns. The first eigenvector has positive weightings for bands 1 and
2, and negative weightings for bands 3 and 4. More importantly, band 3 has the largest
(magnitude) weighting and will clearly dominate the 1st PC image. The second eigenvector is
dominated by band 4.
The eigenvalues sum to ~1992. That means that the variance "explained" by the first eigenvector
(Principal Component, PC) is 1208.6 / 1992 = 0.607 or ~61%. Most of the remaining variance is
"explained" by the second eigenvector (PC2), with less than two percent of the variance
"explained" by the last two PCs.
To summarize:
Eigenvector   Description           Eigenvalue   Fraction of variance   Cumulative fraction
                                                 explained              explained
1             Dominated by band 3   1208.6       0.607                  0.607
2             Dominated by band 4    745.2       0.374                  0.981
3             Band 4 - visible        35.5       0.018                  0.999
4             Band 1 - Band 2          2.66      0.001                  1.000
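The variance fractions in the summary can be checked directly from the four eigenvalues. The short sketch below uses only the eigenvalues quoted in this handout; the variable names are illustrative.

```python
import numpy as np

# Eigenvalues reported in the statistics summary above.
eigenvalues = np.array([1208.6, 745.2, 35.5, 2.66])

fraction = eigenvalues / eigenvalues.sum()   # share of total variance per PC
cumulative = np.cumsum(fraction)             # running total

for i, (ev, f, c) in enumerate(zip(eigenvalues, fraction, cumulative), start=1):
    print(f"PC{i}: eigenvalue={ev:8.2f}  fraction={f:.3f}  cumulative={c:.3f}")
```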

Compute and display the PC images for bands 1 and 2


Bands 1 and 2 are highly correlated. Applying the PC transformation to this pair will place
the highly correlated information in PC1 and the least correlated information in PC2
(ranked by variance).
Select Transform > Principal Component > Compute
Statistic > Forward PC Rotation > PC Rotation from
New Stats
Select SPOT_XI
Select Spectral Subset, and highlight bands 1 & 2 and
Select OK.
Select OK in the Principal Components Input File
window.
Fill out the Forward PC Parameters window (Figure 3).
Be careful to use file names that characterize the
procedure (e.g., 2-bands, covariance).
Select OK
Display the two PC images as gray scale images in two
separate display windows.
Figure 3: Forward PC Parameters
Display the stats for this operation:
Select Basic Tools > Statistics > View Statistics File
Select the 2-band stats file.
PC1 is a nearly equal combination of bands 1 and 2 and describes more than 98% of the
variance of the image pair. PC2 is a difference image that explains less than 2% of the total
variance.
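The behavior described for the 2-band case (PC1 a near-equal sum, PC2 a difference) can be reproduced with a small NumPy sketch. The data here are synthetic, with two bands built from a shared signal plus a little independent noise; the means and noise levels are assumptions chosen only for illustration. Note that this sketch stores the eigenvectors in columns, whereas the ENVI report lists them in rows.

```python
import numpy as np

rng = np.random.default_rng(1)
shared = rng.normal(0.0, 10.0, size=5000)             # signal common to both bands
band1 = 60.0 + shared + rng.normal(0.0, 1.0, size=5000)
band2 = 45.0 + shared + rng.normal(0.0, 1.0, size=5000)
pixels = np.column_stack([band1, band2])

cov = np.cov(pixels, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)       # eigh returns ascending order
order = np.argsort(eigenvalues)[::-1]                 # sort by descending variance
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]                 # column k is the k-th eigenvector

# Forward PC rotation: subtract the band means, then project onto the eigenvectors.
pcs = (pixels - pixels.mean(axis=0)) @ eigenvectors
explained = eigenvalues / eigenvalues.sum()

print("PC1 weights:", np.round(eigenvectors[:, 0], 3))
print("PC2 weights:", np.round(eigenvectors[:, 1], 3))
print("fraction of variance:", np.round(explained, 3))
```

The overall sign of each eigenvector is arbitrary, so the PC1 weights may both be negative; what matters is that they share a sign while the PC2 weights have opposite signs.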

CHALLENGE: Can you tell whether or not this image has been geometrically altered?
Hint: SPOT is a pushbroom scanner.

Compute and display the PC images for all 4 bands


a. Select Transform > Principal Component > Compute Statistic > Forward PC Rotation > PC
Rotation from Existing Stats
b. Select SPOT_XI, full scene, all bands. Select OK
c. Select the statistics file created earlier (SPOT_XI-PCA.sta).
d. Enter a name for the output PC image file. I suggest SPOT_XI-PCA.img.
e. Verify that the transformation will be based on the covariance matrix.
f. Select OK.
g. Display the 4 PC images. Notice:
i. Contrast decreases from PC 1 to 4. Compare the histograms of the images to see this
graphically.
ii. PC1 is essentially a "brightness" image.
iii. PC2 contains the bulk of the color contrast.
iv. Boundaries (edges) tend to be more pronounced in PC3 & PC4, while topographic detail is
suppressed.
v. Banding noise (low variance and uncorrelated) does not appear until PC4.
vi. Compare the eigenvector components for PC4 to the components for PC2 of the 2-band
transformation. (Try displaying a 2-D scatterplot using these two bands).
vii. Compare the eigenvector components for PC2 to the components for PC1 of the 2-band
transformation.

Recreate the original images using a subset of the principal component images.


a. Use the 1st two PC images to recreate the original image data. The first 2 PC images
"explain" 98% of the variance. Is this good enough?
i. Select Transform > Principal Components > Inverse PC Rotation.
ii. Choose the PC image (SPOT_XI-PCA.img).
iii. Select Spectral Subset and select the first 2 PC images. OK.
iv. OK
v. Select the appropriate stats file (SPOT_XI-PCA.sta)
vi. Select an appropriate name for the images created using the PCA inversion
(SPOT_XI-invPCA-3.img).
vii. Verify that the inversion is performed using the covariance matrix.
viii. Select OK
b. Compare the original data with the inverse 2-PC image data.
i. Display the original and transformed images side by side, one band at a time. Stretch the
frame to display the full image.
ii. Look for differences between the images. (Hint: Look at the lake. Look at the
boundaries of the lake.)
iii. Link the two images and use the zoom window to examine random areas. What is your
opinion?
c. Repeat the inversion procedure using the 1st three PC images.
d. Compare the original data with the inverse 3-PC image data.
The differences are harder to see by direct visual comparison. To see ONLY the differences
you can use the Spectral Math function to subtract one image from the other.
i. Select Basic Tools > Spectral Math.
ii. Under Enter an expression enter float(s1) - float(s2). The conversion to floating point
ensures that the difference will be handled correctly (e.g., avoiding byte arithmetic and
any possible confusion over the sign).
iii. Select OK.
iv. Select S1 in the upper window
v. Select Map Variable to Input File below the second window and select the invPC image
and select OK. This assigns the invPC image to the variable S1.
vi. Use the same procedure to assign S2 to the original SPOT_XI image.
vii. Either select output to memory or choose a name for the result file.
viii. Select OK and examine the results.
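The reason for the floating-point conversion in the difference expression can be illustrated outside ENVI. The sketch below uses NumPy as an analogy (it is not the IDL expression itself), with two tiny made-up byte arrays standing in for the original and inverted images.

```python
import numpy as np

# Hypothetical 8-bit pixel values; not taken from the SPOT scene.
original = np.array([[10, 200], [128, 0]], dtype=np.uint8)
inverted = np.array([[12, 198], [128, 3]], dtype=np.uint8)

# Byte arithmetic: negative results wrap around modulo 256.
wrapped = inverted - original

# Converting to floating point first preserves the sign of the difference.
difference = inverted.astype(np.float32) - original.astype(np.float32)

print(wrapped)
print(difference)
```

In the byte result, a true difference of -2 shows up as 254, which would look like a very large error in a difference image.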
Ideally, the difference image should be mostly noise. Based on the statistics, the 3 PC images
"explain" 99.9% of the variance in the image data. Generally the last little bit of uncorrelated
variance is dominated by noise and can be ignored. Indeed, when the higher order PC data are
essentially noise, the inverted data are cleaner and relatively noise free. This can be a major
improvement especially with a noisy system. The SPOT data are remarkably "clean", i.e., noise
free, and the sorting done by the PCA has not been particularly helpful. What has been removed
is what appears in the last PC image which shows the banding noise, but also shows significant
image detail.
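The inverse rotation with a subset of PCs can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the ENVI implementation: the function pca_truncate is a hypothetical helper, and the 4-band image is synthetic, built from two underlying signals plus uniform noise so that the last two PCs contain only noise.

```python
import numpy as np

def pca_truncate(image, n_components):
    """Forward PC rotation, keep the first n_components, then invert.

    Returns the restored image and the eigenvalues in descending order.
    """
    rows, cols, bands = image.shape
    pixels = image.reshape(-1, bands).astype(np.float64)
    mean = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)    # ascending order
    order = np.argsort(eigenvalues)[::-1]
    kept = eigenvectors[:, order[:n_components]]       # columns = retained eigenvectors
    pcs = (pixels - mean) @ kept                       # forward rotation
    restored = pcs @ kept.T + mean                     # inverse rotation
    return restored.reshape(rows, cols, bands), eigenvalues[order]

# Synthetic 4-band scene (an assumption for illustration only).
rng = np.random.default_rng(2)
rows, cols = 40, 40
brightness = rng.normal(100.0, 20.0, (rows, cols))
greenness = rng.normal(0.0, 10.0, (rows, cols))
signal = np.stack([brightness,
                   0.9 * brightness,
                   brightness + 2.0 * greenness,
                   0.7 * brightness - greenness], axis=-1)
noisy = signal + rng.normal(0.0, 1.0, signal.shape)

restored, eigenvalues = pca_truncate(noisy, 2)
difference = noisy - restored                          # what the inversion removed

rms_noisy = np.sqrt(np.mean((noisy - signal) ** 2))
rms_restored = np.sqrt(np.mean((restored - signal) ** 2))
print("eigenvalues:", np.round(eigenvalues, 2))
print("RMS error before / after:", round(rms_noisy, 3), round(rms_restored, 3))

# Keeping all four PCs reproduces the input exactly (to rounding error).
full, _ = pca_truncate(noisy, 4)
```

In this synthetic case the discarded PCs hold only noise, so the inversion is cleaner than the input. As noted above, that is not guaranteed for real data: whatever image detail falls into the discarded PCs is removed along with the noise.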
Task B: PCA with hyperspectral data (Hyperion scene of Lansing, NY)

Use the Lansing_272_clipped image on the CEE6150 Assignments page. This is a 196-band subset of a
224-band Hyperion image shown in class. (Uncalibrated and redundant bands have been removed.)
Compute and display the statistics for the full scene (Basic Tools > Statistics > Compute New
Statistics). Be sure to request the covariance image and eigenvalue plots. Save the statistics as
a text file.
Display the covariance and correlation images.
Open the Cursor Location/Value window. You can use this to get specific values for locations in
the covariance and correlation images.

Evaluate the statistics for the hyperspectral scene.


Based on a visual inspection of the correlation matrix, how many distinct spectral regions are
there in the image data? Please identify the regions by band number and wavelength range.
(The position of the cursor in the Cursor Location/Value window corresponds to the band
numbers. The wavelength for each band can be found in the available bands list.)
Note:
The light gray band in the visible corresponds to the green peak in vegetation.
The gray bands in the NIR correspond to atmospheric water absorption bands.
The dark bands in the SWIR correspond to strong atmospheric water absorption bands.
What are your criteria for selecting the number of unique spectral ranges?
What is the typical variance for each range? (You can get a better visual idea of this from the
spectral plot of the standard deviation in the statistics window.)
Perform the principal components transformation on the full data set.
Use the covariance criterion for the transformation.
Be careful to name the statistics file in a way that will make it easily identifiable.
How many of the PC images appear to have usable information?
How many of the PC images are dominated by noise?
What percentage of the variance in this subset is "explained" by the PC images with obviously
usable information?
Perform the inverse PCA using only the first 4 PC images. Evaluate the effect of the inverse
PCA using the limited set of PC images.
Display the animation set for the inverse PCA bands AND the animation set for the same band
range of the original image.
Sort through the original images looking for spectral regions that are obviously contaminated with
noise. In those regions, compare bands from the inverted PCA and the original data.
Is there noise in the original images that has been removed in the inverse PC images?
Is there any noise or other artifact in the inverse PC images that was not in the original data?
Are there noise or other artifacts that remain in the inverse PC images? If so, can
you posit a reason why this would have happened?
Link the original and inverse PCA images (Tools > Link > Link Displays), then display the
spectral profiles (Tools > Profile > z-profile (spectrum)) for the inverse PCA bands AND the
original image data.
Arrange the images, the zoom images and the spectral profiles so that it is easy to view all
together.
Examine spectra in homogeneous areas (water, forest, soil) and compare spectra for the two image
data sets. Click on an area in the zoom window and then move the cursor using the arrow keys.
Water:
Note the relative noisiness of the two spectra especially in the SWIR where water is
essentially black.
Note the relative stability of the spectra as you move the cursor through the water area.
Forest:
Note the relative clarity of the green peak in the visible.
Observe the noise level in the SWIR.
Soil:
Consider the relative smoothness of the spectra in the inverse PCA images
Consider the stability of the inverse PCA spectra relative to the original images.

Does it appear that 4 PCs were sufficient to characterize the full range of spectral detail in the
196-band image set?
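The spectral comparison above (smoothness and stability of the inverse PCA spectra relative to the original) can be explored numerically with a rough sketch like the one below. Everything in it is an assumption made for illustration: the 196 band centers are evenly spaced placeholders, the three "endmember" spectra are made-up smooth curves rather than measured water, forest or soil spectra, and the roughness measure (mean absolute second difference along the spectrum) is just one possible choice.

```python
import numpy as np

rng = np.random.default_rng(3)
n_pixels, n_bands, n_keep = 2000, 196, 4
wavelength = np.linspace(400.0, 2400.0, n_bands)     # nm, placeholder band centers

# Three smooth, made-up spectra mixed in random proportions.
endmembers = np.stack([
    np.exp(-((wavelength - 550.0) / 150.0) ** 2),
    1.0 / (1.0 + np.exp(-(wavelength - 720.0) / 20.0)),
    wavelength / wavelength.max(),
])
abundance = rng.random((n_pixels, 3))
clean = abundance @ endmembers
noisy = clean + rng.normal(0.0, 0.02, clean.shape)   # add uncorrelated band noise

# Forward rotation, keep the first n_keep PCs, inverse rotation.
mean = noisy.mean(axis=0)
cov = np.cov(noisy, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
kept = eigenvectors[:, order[:n_keep]]
restored = (noisy - mean) @ kept @ kept.T + mean

def roughness(spectra):
    """Mean absolute second difference along the spectral axis."""
    return np.mean(np.abs(np.diff(spectra, n=2, axis=1)))

explained = eigenvalues[order][:n_keep].sum() / eigenvalues.sum()
rmse_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
rmse_restored = np.sqrt(np.mean((restored - clean) ** 2))
print("variance explained by kept PCs:", round(explained, 4))
print("roughness original / inverse:", roughness(noisy), roughness(restored))
print("RMSE original / inverse:", rmse_noisy, rmse_restored)
```

Because the synthetic scene contains only three independent signals, four PCs are more than enough here. Whether four PCs suffice for the real 196-band scene is the question posed above and has to be judged from the data.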
Perform the principal components transformation on the spectral range, 700-1300 nm.
Use the covariance criterion for the transformation.
Be careful to name the statistics file in a way that will make it easily identifiable.
How many of the PC images appear to have usable information?
How many of the PC images are dominated by noise?
What percentage of the variance in this subset is "explained" by the PC images with obviously
usable information?
Perform the inverse PCA using only the first 3 PC images. Evaluate the effect of the inverse
PCA using the limited set of PC images.
Display the animation set for the inverse PCA bands AND the animation set for the same band
range of the original image.
Sort through the images, comparing the filtered and original data for each band and locate any
obvious differences between the data sets.
Are there bands or band ranges in which the data are obviously altered?
Is there noise in the original images that has been removed in the inverse PC images?
Is there any noise or other artifact in the inverse PC images that was not in the original data?
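Selecting the 700-1300 nm range amounts to picking the bands whose center wavelengths fall in that interval. A minimal sketch, assuming a placeholder list of evenly spaced band centers (the real Hyperion band centers should be read from the image header or the available bands list) and a random stand-in cube:

```python
import numpy as np

rng = np.random.default_rng(4)
n_bands = 196
wavelength = np.linspace(430.0, 2400.0, n_bands)     # nm, placeholder values only
cube = rng.normal(0.0, 1.0, size=(30, 30, n_bands))  # stand-in for the image cube

# Boolean mask of the bands inside the requested spectral range.
in_range = (wavelength >= 700.0) & (wavelength <= 1300.0)
band_numbers = np.flatnonzero(in_range) + 1          # 1-based, as in a bands list
subset = cube[:, :, in_range]

print("bands", band_numbers[0], "to", band_numbers[-1])
print("subset shape:", subset.shape)
```

The resulting subset can then be transformed and inverted in the same way as the full data set, keeping only the first 3 PCs.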
