
Geography 309

Lab 4

Page 1

LAB 4: UNSUPERVISED CLASSIFICATION


Question Sheets
Due Date: November 5

Objectives: To study some of the mechanics of unsupervised classification.

Preparation: Read Chapter 12 in your text.

Notes
1. Image classification is a complex task, and it takes considerable effort to become comfortable with classification concepts. As in the previous Lab, you are asked to do some basic image manipulations by hand using a pencil and paper before attempting a similar operation on the computer. In Part B you will apply the concepts you have learned in Part A to classify a digital image on the computer. The manual techniques described here are very similar to the tasks performed by computer; however, the computer's great speed permits it to handle much larger images with more channels and greater radiometric range.
2. All Figures and Tables referenced in Part A are included on the Answer Sheets attached to the end of this Lab. Please use these Answer Sheets to submit your answers to Part A.
3. Detailed instructions for image classification using Geomatica can be found on-line by following the Geomatica Visual Guide link from the bottom of the course homepage.
4. At any time when you are working with the Geomatica software, you can save a "snapshot" of your work in progress by saving it in a Project (look in the File menu). Be sure to save your projects in your personal directory space.

A. Manual Image Classification¹


It is often useful to delineate spectrally distinct areas of an image, even when nothing is known about the environmental character of the resulting subdivisions or classes. These classes are then mapped and the map taken into the field to identify the classes. This procedure is known as unsupervised classification since no training sites are involved. The main advantage of using an unsupervised classification technique is that the classes are subdivided based on their statistical characteristics usually covering large geographical areas, rather than depending on a training sample which may be quite unrepresentative of the class variability over the whole scene to be mapped.

¹ This material is derived from material presented in the publication Introduction to Digital Images and Digital Image Analysis Techniques by Tom Alföldi.

J.M. Piwowar 2010.10.22


There are a large number of mathematical algorithms which use various schemes to locate and separate the statistically "cohesive" clusters in feature space, which are likely to have environmental significance. Most of these algorithms rely on finding areas of high pixel density (in feature space) which are separated by regions of low density. The following task serves to illustrate this by a less sophisticated algorithm than is actually used in practice.
Question 1: (1 mark) The feature space representation of the image in Figure 1 is shown in Figure 2. We can artificially define the high density clusters by eliminating (from view) all the low density cells. Into Figure 3, copy those cells from Figure 2 which have a count of (density of) 3 or more.
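The feature space count and the density threshold described above can also be checked on a computer. The following is a minimal sketch, not the algorithm used by Geomatica: the two arrays are transcribed from Figure 1 on the Answer Sheets (verify them against your printed copy), and the function names are illustrative only.

```python
from collections import Counter

# Digital numbers transcribed from Figure 1 (assumption: check against the
# printed figure). One inner list per image line, 7 lines x 7 pixels.
BAND_A = [
    [3, 4, 3, 5, 4, 4, 5],
    [4, 4, 3, 5, 5, 3, 5],
    [1, 4, 5, 2, 5, 3, 3],
    [1, 2, 2, 2, 2, 4, 4],
    [2, 1, 2, 2, 2, 4, 4],
    [2, 2, 2, 2, 2, 5, 5],
    [2, 2, 2, 2, 2, 5, 4],
]
BAND_B = [
    [7, 7, 4, 7, 6, 7, 7],
    [7, 7, 4, 7, 7, 5, 6],
    [0, 6, 7, 4, 7, 5, 6],
    [0, 0, 1, 4, 4, 5, 7],
    [0, 0, 1, 3, 5, 7, 7],
    [3, 3, 3, 4, 5, 7, 7],
    [3, 4, 4, 4, 5, 6, 7],
]


def feature_space(band_a, band_b):
    """Count how many pixels share each (band A, band B) value pair."""
    counts = Counter()
    for row_a, row_b in zip(band_a, band_b):
        counts.update(zip(row_a, row_b))
    return counts


def high_density_cells(counts, threshold=3):
    """Keep only the feature space cells holding at least `threshold` pixels."""
    return {cell for cell, n in counts.items() if n >= threshold}


counts = feature_space(BAND_A, BAND_B)
nuclei = high_density_cells(counts)
print(sorted(nuclei))  # (band A, band B) coordinates of the high density cells
```

Each printed pair is a cell that survives the "3 or more" rule, which is what you copy into Figure 3 by hand.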

You should now see three clusters of high density cells in Figure 3. These groupings of high density cells should each only be considered the nucleus of a cluster. The next step is to define the boundaries of each whole cluster by spreading out from the nucleus.
Question 2: (1 mark) Let's call the cluster with a nucleus of only two cells Cluster A or Class 'A'. Identify each cell that touches the nucleus cells of cluster A by marking such cells with the letter A in Figure 3. There should be 10 such cells marked, counting even those cells that touch with a corner only. Repeat the process for cluster B (with three nucleus cells), using the letter B for the neighbouring cells, and for cluster C (with one nucleus cell), using the letter C. There will be a point of ambiguity where two clusters overlap and a cell is identified as belonging to the neighbourhood of two clusters. A decision must be forced, so identify this conflict cell as belonging to the cluster with the larger nucleus. Draw the boundary for each cluster enclosing its complete neighbourhood in Figure 3. There should be 11 cells inside the boundary for cluster A, 15 cells in cluster B, and six cells in cluster C.
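The neighbourhood rule in Question 2 (touching cells, corners included, with contested cells given to the larger nucleus) can be sketched as follows. The nucleus coordinates are an assumption read from Figure 2 as (band A, band B) pairs, and the helper names are hypothetical; the cell counts it reports should agree with the totals quoted in the question.

```python
# Nucleus cells as (band A, band B) pairs. Assumption: read from Figure 2;
# confirm them against your own Figure 3 before relying on the result.
NUCLEI = {
    "A": {(4, 7), (5, 7)},
    "B": {(2, 3), (2, 4), (2, 5)},
    "C": {(1, 0)},
}
GRID_MAX = 9  # both feature space axes run from 0 to 9


def neighbourhood(nucleus):
    """Return the nucleus plus every in-grid cell touching it (corners count)."""
    cells = set()
    for a, b in nucleus:
        for da in (-1, 0, 1):
            for db in (-1, 0, 1):
                na, nb = a + da, b + db
                if 0 <= na <= GRID_MAX and 0 <= nb <= GRID_MAX:
                    cells.add((na, nb))
    return cells


def grow_clusters(nuclei):
    """Assign each cell to one cluster; a contested cell goes to the larger nucleus."""
    owner = {}
    # Visit clusters from largest nucleus to smallest so the first claim wins.
    for name in sorted(nuclei, key=lambda k: len(nuclei[k]), reverse=True):
        for cell in neighbourhood(nuclei[name]):
            owner.setdefault(cell, name)
    return owner


owner = grow_clusters(NUCLEI)
sizes = {name: sum(1 for v in owner.values() if v == name) for name in NUCLEI}
print(sizes)  # {'A': 11, 'B': 15, 'C': 6}
```

Processing the clusters in order of nucleus size is one simple way to force the decision for the conflict cell; marking the cells by hand and resolving the overlap afterwards gives the same boundaries.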

The clusters just created were defined as a uniform perimeter around each nucleus, without regard for the presence or absence of actual image data. A more accurate cluster representation can be obtained by combining the cluster definitions from Figure 3 with the pixel counts from Figure 2.
Question 3: (1 mark) Transfer the cluster boundaries from Figure 3 back to the original feature space in Figure 2. Now copy those cells in Figure 2 which fall inside the boundary for cluster A and which have a pixel density of 1 or greater into the corresponding cells in Figure 4. Identify those cells by the letter A. Repeat the procedure for clusters B and C.

Now that feature space has been (pseudo-) statistically subdivided into cohesive clusters, it remains to map these clusters into their geographical locations.


Question 4: (1 mark) For each pixel in the image, retrieve the band 'A' and band 'B' spectral coordinates from the digital maps of Figure 1. Determine which class to assign these coordinates to by looking them up in Figure 4. Record the class by its representative symbol (A, B, C, or leave blank for undefined) in Figure 5. Only the last three pixels of each line need to be considered, since the first four pixels of each line have been mapped for you.
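The look-up step in Question 4 amounts to a table look-up for every pixel. Here is a minimal sketch, assuming the clusters of Figure 4 are stored as a dictionary keyed by (band A, band B) pairs. The sample data are the first four pixels of lines 1 and 2 of Figure 1, whose classes are already filled in on Figure 5; the function name and the dictionary layout are illustrative.

```python
def classify(band_a, band_b, lookup, blank=""):
    """Map every pixel to a class symbol via its (band A, band B) coordinates.

    Pixels whose coordinates fall outside every cluster are left blank.
    """
    return [
        [lookup.get(pair, blank) for pair in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]


# First four pixels of lines 1 and 2 from Figure 1.
sample_a = [[3, 4, 3, 5], [4, 4, 3, 5]]
sample_b = [[7, 7, 4, 7], [7, 7, 4, 7]]

# Partial look-up table: only the feature space cells needed for the sample.
lookup = {(3, 7): "A", (4, 7): "A", (5, 7): "A", (3, 4): "B"}

for line in classify(sample_a, sample_b, lookup):
    print(line)
# ['A', 'A', 'B', 'A']
# ['A', 'A', 'B', 'A']
```

Extending `lookup` to every labelled cell of your Figure 4 and passing in the full bands reproduces the hand mapping for the whole image.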

The unsupervised classification in Figure 5 shows the spatial distribution of 3 classes, A, B, and C, but no attempt has been made to give each class a real label. Class identification is the next step of the unsupervised classification process. This may be done by a variety of techniques, notably airphoto interpretation, or actually visiting the site, if practical. It is not necessary to completely cover the scene in question, however. Ground verification, or ground truthing, can be directed to convenient, small, and representative locations in the image where the environmental meaning of a variety of classes may be determined. For instance, the location marked by a star in Figure 5 would be a suitable location to identify classes A, B, and C because of their proximity to each other. By investigating the class definitions in a few such locations, the class labels can be extrapolated to the larger scene with confidence.

Question 5: (2 marks) Assume that in the feature space of the image (Figure 4), band 'A' is representative of the visible spectrum and band 'B' is a near-IR band. Suggest what the 3 classes (clusters) represent.

Question 6: (1 mark) Why was one pixel not classified in the unsupervised classification of Figure 5?

B. Digital Image Classification


Now that you have seen how an unsupervised classification works with test data, you are ready to try one using Geomatica. In an unsupervised classification, the computer examines your image data and attempts to find clusters of pixel values naturally occurring among the different spectral bands. These natural clusters typically represent different land covers. The analyst's job is to attach meaningful labels to the spectral classes produced by an unsupervised classification based on ancillary data gathered from field observations, aerial photographs, maps, and other sources. In this lab, you are not expected to use any of these ancillary data sources; rather, you are to base your class labels on a visual interpretation of the imagery.

Assignment
Create an unsupervised classification of one of the geometrically corrected images you used in Lab 3. If your image has black triangles on its edges, make a subset of it to eliminate these triangles and include only the image data portion.²

1. Following the procedure as outlined in the Geomatica Visual Guide, set up your image for an Unsupervised Classification. Use the Session Configuration exactly as it is shown on the Unsupervised Classification web page.
² To subset an image, use the Clipping/Subsetting function of the Tools menu.

2. Classify the image into 16 classes (Max. Class) using the K-Means algorithm.
   a. You will need to add a new channel to the image file to save this classification.
   b. Assign meaningful Colours and Names to each of the classes created. Recall that unsupervised classes are based on spectral clusters in the data and may or may not be interpretable in real-world terms.
      i. You may find the USGS class labels listed at http://landcover.usgs.gov/classes.php or in Table 11.1 of your text (use the Level II classes) a useful guide.
      ii. Since you probably don't know which crops each farmer was growing when the image was acquired, you won't be able to label the crop classes accurately; try to make an educated guess based on their colours.
      iii. If there are two or more classes which appear to represent the same feature on the ground, give them the same labels and colours.³
   c. Using the Classification Report, prepare a summary table to show the spatial extents of your classes across your image. Use the following headings in your table:
Class #    Class Name    # pixels    Image Coverage (% of image)    Image Coverage (km²)

Question 7: (1 mark) Submit a copy of your classification summary table.
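The two Image Coverage columns follow directly from the pixel counts in the Classification Report. The sketch below shows the arithmetic only. The class names and pixel counts are invented for illustration, and the 30 m pixel size is an assumption; use the pixel size and counts reported for your own image.

```python
def summarize(pixel_counts, pixel_size_m=30.0):
    """Build summary rows: (class #, name, # pixels, % of image, km^2).

    pixel_size_m is the ground size of one square pixel in metres
    (assumed 30 m here; replace it with the value for your image).
    """
    total = sum(pixel_counts.values())
    km2_per_pixel = pixel_size_m ** 2 / 1_000_000  # square metres to km^2
    rows = []
    for number, (name, n) in enumerate(pixel_counts.items(), start=1):
        rows.append((number, name, n, 100.0 * n / total, n * km2_per_pixel))
    return rows


# Hypothetical counts, not taken from any real classification report.
example_counts = {"Water": 12000, "Cropland": 30000, "Grassland": 18000}

rows = summarize(example_counts)
print(f"{'Class #':>7}  {'Class Name':<12}{'# pixels':>9}{'% of image':>12}{'km2':>8}")
for number, name, n, percent, km2 in rows:
    print(f"{number:>7}  {name:<12}{n:>9}{percent:>12.1f}{km2:>8.2f}")
```

With these made-up counts, Water covers 20.0 % of the image, or about 10.80 km² at the assumed pixel size.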

   d. Prepare a Map Composition to show your classified image. Follow the instructions for Simple Mapping in the Geomatica Visual Guide.
      i. Include a Neatline, Border, Legend, and Title on your composition.
      ii. Change the main title to something more meaningful than the default. Use your name as the Sub-title.
Question 8: (2 marks) Submit your map composition of your K-Means classified image.⁴,⁵

Question 9: (2 marks) Using your textbook, or other reference(s), describe in 1 paragraph how the K-means algorithm works.
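For orientation while reading, here is a minimal sketch of a generic textbook k-means loop in plain Python. It is not Geomatica's implementation, which may seed its clusters and decide when to stop differently. The sample pixels are invented two-band values, and seeding the centres from evenly spaced samples is an assumption made to keep the example repeatable.

```python
def nearest(pixel, centres):
    """Index of the centre closest to the pixel (squared Euclidean distance)."""
    distances = [
        sum((p - c) ** 2 for p, c in zip(pixel, centre)) for centre in centres
    ]
    return distances.index(min(distances))


def kmeans(pixels, k, max_iter=20):
    """Cluster spectral vectors with a basic k-means loop.

    Returns (labels, centres). Assumption: centres are seeded from k evenly
    spaced samples; real software typically uses other seeding rules.
    """
    step = len(pixels) // k
    centres = [pixels[i * step] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: give every pixel to its nearest centre.
        labels = [nearest(p, centres) for p in pixels]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                new_centres.append(
                    tuple(sum(band) / len(members) for band in zip(*members))
                )
            else:
                new_centres.append(centres[c])  # keep a centre with no members
        if new_centres == centres:
            break  # centres stopped moving, so the labels are final
        centres = new_centres
    return labels, centres


# Hypothetical (band A, band B) values for nine pixels.
pixels = [(1, 0), (2, 1), (2, 0), (2, 4), (2, 3), (3, 4), (4, 7), (5, 7), (5, 6)]
labels, centres = kmeans(pixels, k=3)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

The class numbers that come out are arbitrary, which is why the labelling in step 2b is left to the analyst.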

³ In practice, there are tools that you can use to merge several classes like this, but I would like you to keep them separate for now.
⁴ I require colour prints of all your classified images. If you do not have access to a colour printer, you may e-mail your image to me, or I can copy it onto my USB flash drive during the lab.
⁵ If you are submitting your image as a file, send in the .jpg file; do not submit your .prj file.


3. Classify the image into 16 classes (Desired Clusters) using the IsoData algorithm.
   a. You will need to add another new channel to the image file to save this second classification.
   b. Assign meaningful Names, Colours, and Descriptions to each of the classes created. Assign names and colours similar to those you used for the K-Means classification.
   c. Using the Classification Report, prepare a summary table to show the spatial extents of your classes across your image. Use the following headings in your table:
Class #    Class Name    # pixels    Image Coverage (% of image)    Image Coverage (km²)

Question 10: (1 mark) Submit a copy of your classification summary table.

d. Prepare a Map Composition to show your classified image.


Question 11: (2 marks) Submit your map composition of your IsoData classified image.⁴,⁵

Question 12: (2 marks) Using the course text, or other reference(s), describe in 1 paragraph how the IsoData algorithm works.

4. Compare your K-Means and IsoData classifications. How do the total areas for each class compare between the two images? Do you perceive one as more representative of reality?
Question 13: (2 marks) Prepare a paragraph summarizing your classification comparison.

Question 14: (2 marks) Repeat either one of your classifications but only select ETM bands 3, 4 and 7 as the Input Channels in the Session Configuration. Compare your results to your first classification where you used ETM bands 1, 2, 3, 4, 5, and 7. Would you say your results are very similar, similar, different, or very different to those of your first classification? Why? Include a copy of your classified image with your answer.


NAME:

MARK:

LAB 4: UNSUPERVISED CLASSIFICATION


Answer Sheets
Due Date: November 5

BAND A
               PIXELS
LINES   1  2  3  4  5  6  7
  1     3  4  3  5  4  4  5
  2     4  4  3  5  5  3  5
  3     1  4  5  2  5  3  3
  4     1  2  2  2  2  4  4
  5     2  1  2  2  2  4  4
  6     2  2  2  2  2  5  5
  7     2  2  2  2  2  5  4

BAND B
               PIXELS
LINES   1  2  3  4  5  6  7
  1     7  7  4  7  6  7  7
  2     7  7  4  7  7  5  6
  3     0  6  7  4  7  5  6
  4     0  0  1  4  4  5  7
  5     0  0  1  3  5  7  7
  6     3  3  3  4  5  7  7
  7     3  4  4  4  5  6  7

Figure 1: A two-band Image.


Question 1: (1 mark) The feature space representation of the image in Figure 1 is shown in Figure 2. We can artificially define the high density clusters by eliminating (from view) all the low density cells. Into Figure 3, copy those cells from Figure 2 which have a count of (density of) 3 or more.

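Band 'B' Intensities (rows) against Band A Intensities (columns). Each entry is the number of pixels with that pair of values; empty cells are shown as "."

        9 |  .  .  .  .  .  .  .  .  .  .
        8 |  .  .  .  .  .  .  .  .  .  .
        7 |  .  .  .  1  8  8  .  .  .  .
        6 |  .  .  .  1  2  2  .  .  .  .
        5 |  .  .  3  2  1  .  .  .  .  .
        4 |  .  .  7  2  .  .  .  .  .  .
        3 |  .  .  5  .  .  .  .  .  .  .
        2 |  .  .  .  .  .  .  .  .  .  .
        1 |  .  .  2  .  .  .  .  .  .  .
        0 |  .  3  2  .  .  .  .  .  .  .
          +------------------------------
             0  1  2  3  4  5  6  7  8  9
                 Band A Intensities

Figure 2: Feature space representation.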


(Blank 10 × 10 feature space grid: Band A Intensities 0 to 9 on the horizontal axis, Band 'B' Intensities 0 to 9 on the vertical axis.)

Figure 3: High density clusters.


Question 2: (1 mark) Let's call the cluster with a nucleus of only two cells Cluster A or Class 'A'. Identify each cell that touches the nucleus cells of cluster A by marking such cells with the letter A in Figure 3. There should be 10 such cells marked, counting even those cells that touch with a corner only. Repeat the process for cluster B (with three nucleus cells), using the letter B for the neighbouring cells, and for cluster C (with one nucleus cell), using the letter C. There will be a point of ambiguity where two clusters overlap and a cell is identified as belonging to the neighbourhood of two clusters. A decision must be forced, so identify this conflict cell as belonging to the cluster with the larger nucleus. Draw the boundary for each cluster enclosing its complete neighbourhood in Figure 3. There should be 11 cells inside the boundary for cluster A, 15 cells in cluster B, and six cells in cluster C.


Question 3: (1 mark) Transfer the cluster boundaries from Figure 3 back to the original feature space in Figure 2. Now copy those cells in Figure 2 which fall inside the boundary for cluster A and which have a pixel density of 1 or greater into the corresponding cells in Figure 4. Identify those cells by the letter A. Repeat the procedure for clusters B and C.

(Blank 10 × 10 feature space grid: Band A Intensities 0 to 9 on the horizontal axis, Band 'B' Intensities 0 to 9 on the vertical axis.)

Figure 4: Feature space.


Question 4: (1 mark) For each pixel in the image, retrieve the band 'A' and band 'B' spectral coordinates from the digital maps of Figure 1. Determine which class to assign these coordinates to by looking them up in Figure 4. Record the class by its representative symbol (A, B, C, or leave blank for undefined) in Figure 5. Only the last three pixels of each line need to be considered, since the first four pixels of each line have been mapped for you.

               PIXELS
LINES   1  2  3  4  5  6  7
  1     A  A  B  A
  2     A  A  B  A
  3     C  A  A  B
  4     C  C  C  B
  5     C  C  C  B
  6     B  B  B  B
  7     B  B  B  B

Figure 5: Unsupervised classification.

Question 5: (2 marks) Assume that in the feature space of the image (Figure 4), band 'A' is representative of the visible spectrum and band 'B' is a near-IR band. Suggest what the 3 classes (clusters) represent.

Question 6: (1 mark) Why was one pixel not classified in the unsupervised classification of Figure 5?
