
Cluster Analysis in GIS for Site Similarity Assessment

Minghua Zhang 1, Paul Hendley 1, and YinYan Guo 2
1 Environmental Fate and Risk Assessment, Zeneca Ag Products, 1200 S. 47th Street, Richmond, CA 94804-4610
2 Department of Land, Air and Water Resources, University of California, Davis, Davis, CA 95616

Abstract

Surface water monitoring for agrochemical residues is increasingly becoming a part of the agrochemical post-registration process under FIFRA and FQPA. This study uses the ARP (Acetochlor Registration Partnership) surface water monitoring data to demonstrate a fledgling statistical approach to determine similarities between residue patterns at CWS (Community Water Supply) sites used in the ARP program. Simple correlation coefficients were used as the similarity measure in the cluster analysis. Eleven pairs of sites were found to have a similarity index of roughly r = 0.8 or higher, and these pairs showed similar herbicide residue patterns in surface water. GIS analysis revealed that these paired sites are close to one another on the same river systems. The study demonstrated the utility of statistical analysis of historical data coupled with GIS to refine site selection for surface water monitoring programs.

Introduction
Surface water monitoring for agrochemical residues is increasingly becoming a part of the agrochemical post-registration process under FIFRA and FQPA. Because of the public perception that monitoring data represent reality better than modeling results, long-term monitoring programs are often requested to assist pesticide regulatory decision making. One issue is how to deploy scarce sampling resources most effectively to provide the maximum return on investment. This study uses the ARP (Acetochlor Registration Partnership) surface water monitoring data to demonstrate a fledgling statistical approach to determine similarities between residue patterns at CWS (Community Water Supply) sites used in the ARP program. A better understanding of what causes sites to provide similar or dissimilar residue patterns should permit more cost-effective planning of monitoring programs.

Results
When we cut the cluster tree at a dissimilarity distance of D = 0.8, we obtained 12 clusters for 1995, 12 for 1996 and 16 for 1997 (examples for 1997 in Figures 1 and 2 and Table 1). These clusters were further refined using crude criteria: (1) sites close to one another and (2) sites with the same type of water source. As a result, we identified eleven pairs. These pairs have similar residue patterns by month throughout the three years (example in Figure 3). Visual assessment in GIS revealed that many of these paired sites were close to one another on the same river systems and share similar descriptions in site-specific landscape variables (Table 2). Further assessment found that eight of the eleven pairs were CWS sites withdrawing water from river systems, two pairs drew from reservoirs, and one pair drew from lakes (Figure 4). These paired sites have high correlation coefficients for residue values (Table 3). The F-test showed no significant difference in the total variation of residues in the water systems between the values from all sites and the values excluding the similar sites. The F-test results confirmed that the herbicide residue patterns within each pair were similar. The study demonstrated the utility of statistical analysis of monitoring data coupled with GIS to refine site selection for surface water monitoring programs.
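The variance comparison behind the F-test can be sketched as a classical variance-ratio statistic. The source does not state the exact formulation of its test, so this is an assumption: the function name `variance_ratio` and the residue values below are hypothetical, and the two sets are treated as if they were independent samples.

```python
from statistics import variance


def variance_ratio(all_values, reduced_values):
    """Return (F, df_num, df_den), with the larger sample variance on top.

    Assumed formulation: a two-sample variance-ratio statistic. The source
    only says an F-test compared total variation with and without the
    similar sites.
    """
    v_all = variance(all_values)
    v_red = variance(reduced_values)
    if v_all >= v_red:
        return v_all / v_red, len(all_values) - 1, len(reduced_values) - 1
    return v_red / v_all, len(reduced_values) - 1, len(all_values) - 1


# Hypothetical residue concentrations, not taken from the ARP data.
all_sites = [0.4, 1.2, 2.5, 0.9, 1.8, 0.7, 2.1, 1.1]
without_similar = [0.4, 2.5, 0.9, 1.8, 2.1, 1.1]
f_stat, df_num, df_den = variance_ratio(all_sites, without_similar)
print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) degrees of freedom")
```

The statistic would then be compared with a tabulated critical value of the F distribution for the reported degrees of freedom. An F value below that critical value corresponds to the "no significant difference" outcome described above.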
Table 1. Clusters of sites in 1997 at D=0.8
Cluster 1 Elements 10 547 487 159 301 544 310 534 168 149 345 212 305 219 582 158 606 684 652 277 562 1065 239 569 355 25 330 455 217 577 166 603 245 222 565 1066 1071 147 170 182 574 556 576 73 566 636 608 729 730 699 997 351 1091 150 152 334 335 593 77 596 71 696 408 737 451 1003 1006 143 1005 518 344 258 400 1069 362 259 213 214 352 1067 225 570 437 413 248 403 249 328 303 395 511 1082 157 372 129 359 1013 320 519 374 244 571 1038 454 350 485 315 506 530 531 412 169 89 233 527 702 4 1035 1053 386 1098 242 172 7 18 579 314 307 354 304 1032 1060 268 1009 371 443 1046 461 321 279 269 470 184 155 548 452 296 532 1070 261 341

Materials and Methods


ARP surface water monitoring data from 175 sites over three years (1995-1997) were used in the cluster analysis. The similarity index was the correlation coefficient r(x_i, x_j), which measures the similarity of pesticide residue concentrations between each pair of sites within each year:

$$ r(x_i, x_j) = \frac{\frac{1}{n}\sum_{k=1}^{n} (x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)}{\sigma_{x_i}\,\sigma_{x_j}} $$

where i, j = 1, 2, ..., 175; n is the number of observations for the three chemicals; and \bar{x}_i and \sigma_{x_i} are the mean and standard deviation of the observations at site i.

Dissimilarity coefficients were also calculated using the following equation:

$$ d(x_i, x_j) = 1 - r(x_i, x_j) $$

where i, j = 1, 2, ..., 175. The average linkage method was used to calculate the distance between two clusters.

[Cluster tree figure residue. Axis: distance of dissimilarity coefficients, 0.2 to 1.2. Site labels: 129-VF-KS, 343-PA-IN, 305-BL-NE, 362-FW-IN, 248-MO-IL, 249-RO-IL, 303-OM-NE, 114-RI-KS, 219-SH-IL, 197-EL-IL, 582-WI-IA, 566-LE-IA. Cluster labels: Cluster 2, Cluster 3, Cluster 4; cluster numbers 6 to 9. Site numbers: 332 769 601 1016 58 2 1058 343 3 4 114 197 1076 676 537; 651 125 557 1039 13]

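The average-linkage distance between clusters C_K and C_L is

$$ D_{KL} = \frac{1}{N_K N_L} \sum_{i \in C_K} \sum_{j \in C_L} d(x_i, x_j) $$

where C_K is the Kth cluster, C_L is the Lth cluster, N_K is the number of observations in C_K, and N_L is the number of observations in C_L.

We applied GIS as a visual aid to identify pairs within the same water systems. Finally, the F-test was used to compare the total variation in residues between the values from all sites and the values excluding the similar sites.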
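The steps in this section (correlation as similarity, d = 1 - r as dissimilarity, average linkage between clusters, and a cutoff on the linkage distance) can be sketched in plain Python. This is a minimal illustration, not the software used in the study: the function names, the site keys "A", "B", "C" and the residue series are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Correlation coefficient r between two equal-length residue series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(ck, cl, d):
    """D_KL: mean of d(x_i, x_j) over all i in C_K and all j in C_L."""
    total = sum(d[frozenset((i, j))] for i in ck for j in cl)
    return total / (len(ck) * len(cl))


def cluster_sites(series, cutoff):
    """Merge the two closest clusters until none are within `cutoff`."""
    # Pairwise dissimilarity d = 1 - r for every pair of sites.
    d = {frozenset((i, j)): 1.0 - pearson(series[i], series[j])
         for i, j in combinations(series, 2)}
    clusters = [[site] for site in series]
    while len(clusters) > 1:
        dist, a, b = min(
            (average_linkage(clusters[a], clusters[b], d), a, b)
            for a, b in combinations(range(len(clusters)), 2))
        if dist > cutoff:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


# Hypothetical monthly residue series for three made-up sites.
series = {
    "A": [0.1, 0.5, 2.0, 1.0, 0.2],
    "B": [0.2, 0.6, 2.1, 1.2, 0.3],
    "C": [2.0, 1.0, 0.2, 0.1, 0.1],
}
print(cluster_sites(series, cutoff=0.8))
```

In this toy input, sites A and B rise and fall together while C follows the opposite pattern, so a cutoff of 0.8 groups A with B and leaves C on its own. The pairwise search is quadratic per merge, which is adequate for a few hundred sites.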

Figure 1. Part of the large cluster tree for the data of 1997

[Figure/table residue. Cluster numbers 10 to 16; site numbers: 275 198 17 865 1054 228 622]

Figure 4. Eleven pairs of sites with similar residue patterns and landscape variables

Figure 2. Clusters generated from cluster analysis for the data of 1997

Table 2. Relevant information on landscape structure and land use for the selected monitoring sites.

Site  State  Stratum             Corn intensity  Water treatment  Watershed size (ac)  CN
303   NE     Continental Rivers  0.0348          PAC              204687765.8          76.710
305   NE     Continental Rivers  0.0332          PAC              203739516.1          76.658
651   DE     5-10% CI            0.1039          PAC              43629.4              88.000
652   DE     5-10% CI            0.1060          PAC              100408.7             87.434
71    KS     Continental Rivers  0.0514          PAC              268749082.4          77.315
77    KS     Continental Rivers  0.0508          PAC              267061175.7          77.253
593   PA     11-20% CI           0.1210          PAC or GAC       284337.4             88.000
636   PA     11-20% CI           0.1261          PAC              242629.2             88.000
997   PA     11-20% CI           0.1248          GAC              293855.4             88.697
443   OH     >20% CI             0.2427          Other            131174.2             86.000
519   OH     >20% CI             0.2924          Other            16385.2              91.000
606   IL     >20% CI             0.2535          GAC              3274132.1            91.596
608   IL     >20% CI             0.2626          Other            2844480.5            91.266
152   IL     >20% CI             0.2521          PAC              480357.6             93.134
596   PA     >20% CI             0.2635          Other            12200.8              89.000
729   PA     5-10% CI            0.1044          Other            771279.2             87.470
314   IN     Continental Rivers  0.0557          PAC              68358056.1           85.688
335   IN     Continental Rivers  0.0569          PAC              68778137.5           85.696
547   IA     >20% CI             0.3257          Other            14865.7              68.000
579   IA     >20% CI             0.3624          Other            43134.9              78.000
544   IA     11-20% CI           0.1390          Other            1458.0               94.000
548   IA     11-20% CI           0.1055          Other            6453.1               90.000
182   IL     >20% CI             0.2913          PAC              723.9                85.000
183   IL     >20% CI             0.2913          PAC              613.3                90.000

Table 3. Correlation coefficients for the selected sites

Sites      Type       1995  1996
303 - 305  River      0.91  0.95
651 - 652  River      0.71  0.96
71 - 77    River      0.95  0.82
593 - 997  River      0.76  0.86
593 - 636  River      0.74  0.89
636 - 997  River      0.90  0.97
443 - 519  River      0.91  0.90
606 - 608  River      0.96  0.88
608 - 152  River      0.88  0.85
606 - 152  River      0.82  0.79
596 - 729  River      0.86  0.89
314 - 335  River      0.96  0.89
547 - 579  Lake       0.95  0.91
544 - 548  Reservoir  0.77  0.96
182 - 183  Reservoir  0.98  0.95

1997 (14 values for the 15 rows above, so the row alignment is uncertain): 0.88 0.85 0.85 0.90 0.94 0.94 0.79 0.97 0.79 0.76 0.90 0.81 0.82 0.94

Conclusions
1. Eleven pairs were selected from the analysis. The pairs met the following criteria: (1) a dissimilarity coefficient distance of D = 0.8, specifying similarity in residues by sampling occasion across 3 seasons; (2) sites with the same type of water source; and (3) sites reasonably close to one another.
2. Eight pairs were sites in river systems (Figure 4).
3. All these similar sites share similar landscape structure and land use patterns (see Table 2).
4. This study demonstrates an approach for combining spatial analysis with conventional statistics.
5. The results provide information for selecting monitoring sampling sites.
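The three selection criteria in conclusion 1 can be sketched as a simple filter over candidate site pairs. Everything specific here is an assumption for illustration: the `Site` record, the coordinates, the great-circle distance as a stand-in for "reasonably close" (the study used visual assessment in GIS), and the thresholds `min_r` and `max_km`.

```python
from dataclasses import dataclass
from itertools import combinations
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Site:
    site_id: int
    source: str   # water source type, e.g. "River", "Lake", "Reservoir"
    lat: float    # hypothetical coordinates, not from the source
    lon: float


def haversine_km(a, b):
    """Great-circle distance between two sites in kilometres."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))


def candidate_pairs(sites, r, min_r=0.8, max_km=100.0):
    """Keep pairs that are similar, share a source type, and are close.

    `r` maps frozenset({id_a, id_b}) to a correlation coefficient.
    `max_km` is an assumed cutoff; the source gives no numeric distance.
    """
    pairs = []
    for a, b in combinations(sites, 2):
        key = frozenset((a.site_id, b.site_id))
        if (r.get(key, 0.0) >= min_r
                and a.source == b.source
                and haversine_km(a, b) <= max_km):
            pairs.append((a.site_id, b.site_id))
    return pairs


# Made-up sites and correlations, for illustration only.
sites = [
    Site(1, "River", 41.0, -96.0),
    Site(2, "River", 41.2, -96.1),
    Site(3, "Lake", 41.1, -96.0),
]
r = {
    frozenset((1, 2)): 0.90,
    frozenset((1, 3)): 0.95,
    frozenset((2, 3)): 0.50,
}
print(candidate_pairs(sites, r))
```

In this toy input, sites 1 and 3 are highly correlated but draw from different source types, so only the pair (1, 2) passes all three criteria.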

Figure 3. Residue patterns and levels of 1995-1997 for Acetochlor, Atrazine and Alachlor at the given sites.
