
Environ Monit Assess (2009) 159:543–553

DOI 10.1007/s10661-008-0650-6

Assessment of surface water quality using multivariate statistical techniques: a case study of Behrimaz Stream, Turkey
Memet Varol · Bülent Şen

Received: 10 April 2008 / Accepted: 5 November 2008 / Published online: 3 December 2008
© Springer Science + Business Media B.V. 2008

Abstract Multivariate statistical techniques, such as cluster analysis (CA), principal component analysis, and factor analysis, were applied for the evaluation of temporal/spatial variations and for the interpretation of a water quality data set of the Behrimaz Stream, obtained during 1 year of monitoring of 20 parameters at four different sites. Hierarchical CA grouped 12 months into two periods (the first and second periods) and classified four monitoring sites into two groups (group A and group B), i.e., relatively less polluted (LP) and medium polluted (MP) sites, based on similarities of water quality characteristics. Factor analysis/principal component analysis, applied to the data sets of the two different groups obtained from cluster analysis, resulted in five latent factors amounting to 88.32% and 88.93% of the total variance in water quality data sets of LP and MP areas, respectively. Varifactors obtained from factor analysis indicate that the parameters responsible for water quality variations are mainly related to discharge, temperature, and soluble minerals (natural) and nutrients (nonpoint sources: agricultural activities) in relatively less polluted areas; and organic pollution (point source: domestic wastewater) and nutrients (nonpoint sources: agricultural activities and surface runoff from villages) in medium polluted areas in the basin. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and interpretation of data sets and, in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective stream water quality management.

Keywords Behrimaz stream · Water quality · Multivariate techniques · Cluster analysis · Factor analysis · Principal component analysis

M. Varol (B)
Ministry of Agriculture and Rural Affairs, Province Control Laboratory, 21010, Diyarbakir, Turkey
e-mail: mvarol23@gmail.com

B. Şen
Faculty of Aquaculture, Department of Basic Aquatic Sciences, Fırat (Euphrates) University, Elazığ, Turkey

Introduction

Surface water quality is affected by both anthropogenic activities and natural processes. Natural processes influencing water quality include precipitation rate, weathering processes, and sediment transport, whereas anthropogenic activities include urban development and expansion and industrial and agricultural practices. These activities often result in the degradation of water quality, physical habitat, and biological integrity of
lotic system (Carpenter et al. 1998; Qadir et al. 2007). Increasing exploitation of water resources in catchments is responsible for much of the pollution load (Singh et al. 2005). On the other hand, rivers and streams play a major role in assimilating or carrying off municipal and industrial wastewater and run-off from agricultural land. The municipal and industrial wastewater discharge constitutes the constant polluting source, whereas the surface run-off is a seasonal phenomenon, largely affected by climate in the basin (Vega et al. 1998; Singh et al. 2004). Spatial and temporal variability in water chemistry in rivers and streams is directly related to different factors. Rivers and streams are highly heterogeneous at different spatial scales. The spatial heterogeneity within the stream is due to local environmental conditions (e.g., light, temperature, discharge, and water velocity) that change through time and differences in local channel form, while the degree of temporal variability of surface water chemistry varies as a function of stream/river type and depends on the chemical parameter of interest. This variation may also be due both to hydrologic inputs, which can originate from precipitation, direct overland flow, subsurface flow through shallow soils, and drainage from shallow and deep aquifers, and to instream processes, which include dilution, metal release, and adsorption from sediments as well as precipitation (Qadir et al. 2007).

Growing population, increased economic activity, and industrialization have resulted in an increased water demand. In addition, rapid urbanization is changing patterns of consumption. This has caused a severe misuse of water resources. Rivers, streams, and their tributaries passing through the cities are receiving large amounts of contaminants released from industrial, domestic/sewage, and agricultural effluents, which has resulted in an increasing degradation of freshwater ecosystems, mainly by eutrophication (Qadir et al. 2007). Indiscriminate discharge of these effluents (whether from industrial, municipal, or agricultural activities) containing toxic substances into the aquatic environment creates problems of water pollution, rendering water no longer fit for drinking, agriculture, and aquatic life (Fent 2004; Qadir et al. 2007). Therefore, it is important to control water pollution, monitor water quality in the stream basin (Simeonov et al. 2003), and interpret temporal and spatial variations in water quality (Dixon and Chiswell 1996; Singh et al. 2004). Spatio-temporal monitoring of stream water quality has been used as one of the most important tools for water quality assessment (Singh et al. 2004; Shrestha and Kazama 2007).

The application of different multivariate statistical techniques, such as cluster analysis (CA), principal component analysis (PCA), and factor analysis (FA), helps in the interpretation of complex data matrices to better understand the water quality and ecological status of the studied systems, allows the identification of possible factors/sources that influence water systems, and offers a valuable tool for reliable management of water resources as well as rapid solution to pollution problems (Vega et al. 1998; Wunderlin et al. 2001; Reghunath et al. 2002; Simeonova et al. 2003; Shrestha and Kazama 2007). Multivariate statistical techniques have been applied to characterize and evaluate surface and freshwater quality, since they are useful in verifying temporal and spatial variations caused by natural and anthropogenic factors linked to seasonality (Helena et al. 2000; Singh et al. 2004, 2005; Shrestha and Kazama 2007; Qadir et al. 2007).

In the present study, data sets obtained during a 1-year monitoring program were subjected to different multivariate statistical techniques to get information about the similarities or dissimilarities among the monitoring periods or sites, to identify water quality variables responsible for spatial and temporal variations in stream water quality, and to assess the influence of possible sources (natural and anthropogenic) on the water quality parameters of the Behrimaz Stream Basin.

Materials and methods

Study area

Behrimaz Stream, which is fed by springs arising near Başkaynak Village in the western part of the Behrimaz Plain, lies south of Hazar Mountain in the Eastern Anatolia Region of Turkey and has a catchment area of 101 km² (Özdemir 1995a). The Behrimaz Stream is the largest tributary to the
Lake Hazar, contributing approximately 70% of the total freshwater flow to the lake (Şen et al. 2002). The stream, on which an irrigation dam was built around Hatun Village, discharges into Lake Hazar from the eastern shore (Fig. 1). The land uses within the watershed of the Behrimaz Stream are agriculture (86.6%), forest, and rural-residential, but urbanization is becoming increasingly important, especially in the areas adjacent to the downstream region. Although the climate of the area is terrestrial, the climate also shows similarities to Mediterranean type (Günek and Yiğit 1995; Yiğit and Çitçi 1995). More than 90% of the annual rainfall usually occurs from November through May. The annual mean rainfall in the period 1991–2003 was 616 mm. The annual mean air temperature is 12–14 °C. Summers in the area are cooler and less dry than in the surrounding areas; winters are rainier and colder. The area is covered with red-brown soils. Colluvial and alluvial soils are present at low regions in the watershed. Lithosolic soils are in the areas where erosion is very severe. In addition, stream bank erosion often occurs during peak flow periods. Afforestation works have been continuing to prevent erosion in these areas since 1968 (Yiğit and Çitçi 1995).

In the present study, four stations were selected from the stream water quality monitoring network. The first two stations were located in the upstream region of the irrigation dam. The other two stations were in the downstream region of the irrigation dam.

Fig. 1 Map of study area and surface water quality monitoring stations (1–4) in the Behrimaz Stream Basin

Monitored parameters and analysis methods

Twenty water quality parameters were monitored at four water quality monitoring stations over 1 year. Grab water samples (2.5 l) were collected at monthly intervals from sampling sites between January 2003 and December 2003. As the irrigation dam was closed from July 2003 to October 2003, the samples were not taken from stations 3 and 4 during these months, which is consistent with the 8 monthly samples per station implied by the 20 × 16 input matrix reported for group B. The selected parameters included flow (Q), water temperature (WT), pH, electrical conductivity (EC), dissolved oxygen (DO), total hardness (TH), total alkalinity (TA), ammonium nitrogen (NH4-N), total Kjeldahl nitrogen (TKN), organic nitrogen (ON), nitrite nitrogen (NO2-N), nitrate nitrogen (NO3-N), total phosphorus (TP), silica (SiO2), sulfate (SO4), sodium (Na), potassium (K), calcium (Ca), total suspended solids (TSS), and total dissolved solids (TDS). All the water quality parameters were expressed in milligram per liter (mg l⁻¹), except Q (l s⁻¹), pH, EC (μS cm⁻¹), and WT (°C). Flow was calculated from the velocity profile measured by floating a buoyant object along parallel longitudinal transects in a 5-m length section of the stream with uniform width. The results were corrected in accordance with bed material. The sampling, preservation, transportation, and analysis of the water samples were performed according to standard methods (APHA 1995). The analytical data quality was ensured through careful standardization, procedural
blank measurements, and spiked and duplicate samples.

Data treatment and multivariate statistical methods

The Kolmogorov–Smirnov (K-S) statistics were used to test the goodness-of-fit of the data to lognormal distribution (Shrestha and Kazama 2007). According to the K-S test, all the variables were lognormally distributed with 95% or higher confidence. Similarly, to examine the suitability of the data for principal component analysis/factor analysis, Kaiser–Meyer–Olkin (KMO) and Bartlett's sphericity tests were performed (Shrestha and Kazama 2007). KMO is a measure of sampling adequacy that indicates the proportion of variance which is common variance, i.e., which might be caused by underlying factors. A high value (close to 1) generally indicates that principal component/factor analysis may be useful, as is the case in this study: KMO = 0.675. Bartlett's test of sphericity indicates whether the correlation matrix is an identity matrix, which would indicate that variables are unrelated. The significance level, which is 0 in this study (less than 0.05), indicates that there are significant relationships among variables.

In order to account for nonnormal distribution of the measured water quality parameters, the correlation structure between the variables was studied by using the Spearman R coefficient as a nonparametric measure of the correlation between the variables, which is computed over ranked data (Wunderlin et al. 2001; Singh et al. 2004; Shrestha and Kazama 2007; Zhou et al. 2007). In the present study, the temporal variations of the stream water quality parameters were first evaluated through a season–parameter correlation matrix, using the Spearman nonparametric correlation coefficient (Spearman's R). The water quality parameters were grouped into different periods based on temporal CA, and each period was assigned a numerical value.

Cluster analysis and FA/PCA were applied on experimental data standardized through z-scale transformation in order to avoid misclassification due to wide differences in data dimensionality (Liu et al. 2003; Singh et al. 2004, 2005). Standardization tends to increase the influence of variables whose variance is small and reduce the influence of variables whose variance is large. Furthermore, the standardization procedure eliminates the influence of different units of measurement and renders the data dimensionless.

Cluster analysis

Cluster analysis is a group of multivariate techniques whose primary purpose is to assemble objects based on the characteristics they possess. Cluster analysis classifies objects so that each object is similar to the others in the cluster with respect to a predetermined selection criterion. The resulting clusters of objects should then exhibit high internal (within-cluster) homogeneity and high external (between-cluster) heterogeneity. Hierarchical agglomerative clustering is the most common approach, which provides intuitive similarity relationships between any one sample and the entire data set, and is typically illustrated by a dendrogram (tree diagram) (Singh et al. 2004; Shrestha and Kazama 2007). The dendrogram provides a visual summary of the clustering processes, presenting a picture of the groups and their proximity, with a dramatic reduction in dimensionality of the original data. The Euclidean distance usually gives the similarity between two samples, and a distance can be represented by the difference between analytical values from the samples (Zhou et al. 2007). In this study, hierarchical agglomerative CA was performed on the normalized data set by means of Ward's method, using Euclidean distances as a measure of similarity.

Principal component analysis/factor analysis

PCA is designed to transform the original variables into new, uncorrelated variables (axes), called the principal components, which are linear combinations of the original variables. The new axes lie along the directions of maximum variance. PCA provides an objective way of finding indices
of this type so that the variation in the data can be accounted for as concisely as possible (Sarbu and Pop 2005). PCs provide information on the most meaningful parameters, which describe the whole data set, affording data reduction with minimum loss of original information (Helena et al. 2000; Shrestha and Kazama 2007).

FA follows PCA. The main purpose of FA is to reduce the contribution of less significant variables to simplify even more of the data structure coming from PCA. A PC is a linear combination of observable water quality variables, whereas varifactors (VF) can include unobservable, hypothetical, latent variables (Vega et al. 1998; Helena et al. 2000). PCA of the normalized variables was performed to extract significant PCs and to further reduce the contribution of variables with minor significance; these PCs were subjected to varimax rotation (raw) generating VFs (Brumelis et al. 2000; Singh et al. 2004, 2005; Love et al. 2004; Shrestha and Kazama 2007).

Results and discussion

The basic statistics of the 1-year data set on stream water quality are summarized in Table 1.

Temporal similarity and period grouping

Temporal CA generated a dendrogram (Fig. 2), grouping the 12 months into two clusters at (Dlink/Dmax) × 100 < 30, and the difference between the clusters was significant. Cluster 1 (the first period) included January, April, May, November, December, February, March, and June, approximately corresponding to the wet season in Turkey (December to May). Cluster 2 (the second period) included the remaining months (July, August, September, and October), closely corresponding to the dry season (June to November). However, if 12 months had been empirically divided into spring (March to May), summer (June to August), autumn (September to November), and winter (December to February), or into dry/wet seasons, a mistake in grouping could have been made. In fact, Fig. 2 shows that the temporal patterns of water quality were not purely consistent with the four seasons or the dry/wet seasons.

Spatial similarity and site grouping

Spatial CA rendered a dendrogram (Fig. 3), where all the four sampling sites on the stream were grouped into two statistically significant clusters at (Dlink/Dmax) × 100 < 60. Group A consisted of stations 1 and 2. Group B consisted of stations 3 and 4. The group classifications varied with significance level because the sites in these groups had similar characteristic features and natural backgrounds that were affected by similar sources. Group A (stations 1 and 2) corresponds to relatively less polluted (LP) sites. In group A, stations are situated at upstream sites of the stream. These stations receive pollution from nonpoint sources, i.e., mostly from agricultural activities. Group B (stations 3 and 4) corresponds to relatively moderate pollution (MP) sites. In group B, stations are situated at downstream sites of the stream. These stations receive pollution from point and nonpoint sources, i.e., agricultural and livestock farms, domestic wastewater, and surface runoff from villages.

Hierarchical CA provided a useful classification of the surface watercourses in the study area, which could be used to design an optimal future spatial monitoring network with lower cost (Simeonov et al. 2003; Singh et al. 2004). According to the above results, the monitoring frequency might be decreased, with monitoring periods selected only from the first and second periods; the number of monitoring sites could also be reduced and chosen only from groups A and B.

Temporal variations in water quality

Temporal variations in water quality parameters (Table 1) were evaluated by using a period–parameter correlation matrix, which showed that most analyzed parameters were significantly correlated (p < 0.05) with period, except pH, NO2-N, NO3-N, TP, SiO2, and SO4. Among these, K, Na, and flow exhibited the highest correlation coefficients (Spearman's R = 0.69, 0.69, and
Table 1 Mean, standard deviation, and maximum and minimum values of water quality parameters at different locations of the Behrimaz Stream
Units: mg l⁻¹, except Q (l s⁻¹), pH, EC (μS cm⁻¹), and T (°C)

Variable Station 1                                   Station 2                                   Station 3                                   Station 4
         Mean       SD         Max        Min        Mean       SD         Max        Min        Mean       SD         Max        Min        Mean       SD         Max        Min
Q        1,054.166  1,135.704  3,153      36         1,049.583  1,143.146  3,108      41         1,496.25   1,084.132  3,044      356        1,404      1,051.366  2,952      348
T        12.141     8.541      25.5       2.3        13.383     10.794     32.6       2.4        8.8375     8.227      25.8       2.3        8.8        8.265      25.2       2.4
pH       8.083      0.345      8.6        7.5        8.091      0.257      8.6        7.7        8.162      0.266      8.4        7.7        8.062      0.219      8.3        7.7
EC       225.833    20.652     260        190        233.333    30.550     300        190        210        17.728     230        180        227.5      19.086     250        200
DO       9.383      0.627      10.8       8.4        9.078      0.801      10.4       7.8        9.837      1.674      11.6       6.5        9.775      2.342      11.8       4.6
TH       168.083    21.296     194        136        173.5      24.126     204        138        164.25     20.005     195        139        175.75     22.205     222        150
TA       163        26.690     192        126        172.083    34.171     224        128        159.25     26.980     190        128        166.25     31.684     204        130
NH4-N    1.415      0.839      2.38       0.18       1.656      1.022      3.39       0.17       1.276      1.102      3.35       0.16       1.188      1.004      3.01       0.16
TKN      2.563      1.295      4.51       0.42       2.818      1.410      5.44       0.48       2.501      1.408      5.32       0.65       2.338      1.428      5.16       0.52
ON       1.147      0.590      2.2        0.24       1.161      0.505      2.05       0.31       1.2        0.480      1.97       0.49       1.15       0.521      2.15       0.36
NO2-N    0.0182     0.0029     0.0233     0.0134     0.0196     0.0037     0.0244     0.0130     0.0219     0.0031     0.0266     0.0157     0.0229     0.0040     0.0313     0.0173
NO3-N    1.217      0.426      1.977      0.312      1.261      0.438      2.080      0.374      1.396      0.593      2.365      0.463      1.363      0.590      2.238      0.445
TP       0.179      0.047      0.266      0.130      0.181      0.049      0.317      0.135      0.196      0.045      0.283      0.145      0.182      0.040      0.267      0.147
SiO2     1.120      0.123      1.328      0.827      1.167      0.148      1.404      0.851      1.123      0.131      1.361      0.975      1.079      0.143      1.318      0.831
SO4      21.991     1.379      24.8       19.8       23.391     1.648      25.8       20.4       24.037     2.019      27         21.2       24.725     2.575      30.4       22.4
Na       13.745     3.208      18.9       9.14       14.471     4.181      20.9       9.27       12.076     2.035      14.6       9.14       11.871     1.186      13.2       9.67
K        1.023      0.393      1.6        0.34       1.105      0.456      1.8        0.34       0.826      0.257      1.09       0.36       0.847      0.246      1.05       0.37
Ca       36.426     7.573      46         24.1       40.42      10.120     56.4       24.7       36.317     6.644      45.04      27.1       38.19      5.853      46.92      30.4
TSS      374.833    457.873    1,413      28         443.583    564.018    1,828      34         605.125    510.111    1,395      76         726.375    681.601    2,060      108
TDS      117.166    11.582     134        96         123.833    11.975     141        103        112.5      12.682     140        96         125.5      12.950     148        114
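The parameters in Table 1 differ widely in scale and relative spread, which is the motivation given in the text for z-scale standardization before CA and FA/PCA. As a rough illustration only, the sketch below computes the coefficient of variation (SD divided by mean) for a few Station 1 rows copied from Table 1 and expresses one observation in SD units. The helper names `coefficient_of_variation` and `z_score` and the choice of rows are assumptions made for this example; they are not part of the original analysis.

```python
# Mean and SD for Station 1, copied from Table 1 (units as in the table).
STATION_1 = {
    "Q": (1054.166, 1135.704),
    "pH": (8.083, 0.345),
    "EC": (225.833, 20.652),
    "TSS": (374.833, 457.873),
}


def coefficient_of_variation(mean, sd):
    """Return SD / mean, a unit-free measure of relative spread."""
    return sd / mean


def z_score(value, mean, sd):
    """Standardize a single observation: (x - mean) / SD."""
    return (value - mean) / sd


cv = {name: coefficient_of_variation(m, s) for name, (m, s) in STATION_1.items()}

# List the parameters from the most stable to the most variable.
for name, value in sorted(cv.items(), key=lambda item: item[1]):
    print(f"{name:>4}: CV = {value:.3f}")

# The Station 1 maximum flow (3,153 l/s) expressed in SD units.
q_max_z = z_score(3153, *STATION_1["Q"])
print(f"z(Q max) = {q_max_z:.2f}")
```

Flow and TSS have an SD larger than their mean, whereas pH varies by only a few percent, so unstandardized Euclidean distances would be dominated by the large-valued parameters.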
Fig. 2 Dendrogram showing clustering of monitoring periods. Leaf order: Jan., Apr., May, Nov., Dec., Feb., Mar., Jun. (the first period); Jul., Aug., Sep., Oct. (the second period). Horizontal axis: (Dlink/Dmax) × 100, from 0 to 120
−0.69, respectively), followed by Ca (R = 0.60), EC (R = 0.59), TSS (R = −0.57), TH (R = 0.52), TA (R = 0.52), and TKN (R = 0.51). The season-correlated parameter can be taken as representing the major source of temporal variations in water quality. Wide seasonal variations in temperature and stream discharge can be attributed to high seasonality in various water quality parameters. Nonsignificant correlation of pH, NO2-N, NO3-N, TP, SiO2, and SO4 with season indicates the con-

Fig. 3 Dendrogram showing clustering of monitoring sites on the Behrimaz Stream. Leaves: Station 1 and Station 2 (Group A); Station 3 and Station 4 (Group B). Horizontal axis: (Dlink/Dmax) × 100, from 30 to 110
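The clustering procedure behind the dendrograms (z-scale standardization, Euclidean distance, Ward's method, and linkage distances rescaled as (Dlink/Dmax) × 100) can be sketched in a few lines. The example below is illustrative only: it clusters the four stations on their annual means for four parameters copied from Table 1 (Q, T, Na, K), whereas the study used the full monthly data set, so the rescaled distances will not match Fig. 3. The function names, the choice of parameters, and the height convention sqrt(2 × cost) are assumptions.

```python
import math
import statistics

# Station means copied from Table 1 (columns: Q, T, Na, K).
STATION_MEANS = {
    "Station 1": [1054.166, 12.141, 13.745, 1.023],
    "Station 2": [1049.583, 13.383, 14.471, 1.105],
    "Station 3": [1496.25, 8.8375, 12.076, 0.826],
    "Station 4": [1404.0, 8.8, 11.871, 0.847],
}


def standardize(rows):
    """z-scale each column so that all variables carry equal weight."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    sds = [statistics.pstdev(c) for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def ward_linkage(labels, points):
    """Repeatedly merge the two clusters whose union gives the smallest
    increase in within-cluster sum of squares (Ward's criterion)."""
    clusters = [([label], point) for label, point in zip(labels, points)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (a, ca), (b, cb) = clusters[i], clusters[j]
                sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = len(a) * len(b) / (len(a) + len(b)) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        (a, ca), (b, cb) = clusters[i], clusters[j]
        centroid = [(len(a) * x + len(b) * y) / (len(a) + len(b))
                    for x, y in zip(ca, cb)]
        # sqrt(2 * cost) equals the Euclidean distance for two single points.
        merges.append((a + b, math.sqrt(2 * cost)))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((a + b, centroid))
    return merges


labels = list(STATION_MEANS)
merges = ward_linkage(labels, standardize(list(STATION_MEANS.values())))
d_max = max(height for _, height in merges)
for members, height in merges:
    print(f"{' + '.join(members)}: (Dlink/Dmax) x 100 = {100 * height / d_max:.1f}")
```

With these four parameters, the first two merges pair Station 1 with Station 2 and Station 3 with Station 4, the same two groups reported for the spatial CA; the final merge joins the two groups at a rescaled distance of 100.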
Table 2 Loadings of experimental variables (20) on significant principal components for group A sites and group B sites

Variable                VF1      VF2      VF3      VF4      VF5
Group A sites
Q                       −0.832    0.191   −0.102   −0.087   −0.273
WT                       0.568    0.318   −0.575    0.209    0.070
pH                       0.379    0.823   −0.263   −0.048   −0.001
EC                       0.339   −0.026    0.032    0.016    0.915
DO                      −0.518   −0.448    0.242    0.531    0.016
TH                       0.827    0.221   −0.208    0.267    0.317
TA                       0.595    0.526   −0.077    0.160    0.538
NH4-N                    0.494   −0.061   −0.178    0.777    0.264
TKN                      0.605    0.046   −0.260    0.691    0.237
ON                       0.649    0.219   −0.339    0.377    0.134
NO2-N                   −0.323    0.359    0.763    0.170    0.221
NO3-N                   −0.108    0.943    0.042    0.213    0.074
TP                      −0.064    0.841    0.307   −0.117   −0.047
SiO2                     0.239   −0.033    0.928   −0.085    0.087
SO4                      0.135    0.160    0.147    0.861   −0.101
Na                       0.734    0.103    0.288   −0.034    0.512
K                        0.838   −0.131    0.022    0.102    0.432
Ca                       0.876    0.132    0.064    0.290    0.133
TSS                     −0.648    0.476    0.217   −0.240    0.077
TDS                      0.192    0.059    0.146    0.119    0.912
Eigenvalue               8.48     3.36     2.70     1.95     1.14
% Total variance        42.42    16.80    13.54     9.79     5.74
Cumulative % variance   42.42    59.23    72.78    82.58    88.32
Group B sites
Q                       −0.286    0.101   −0.917   −0.151   −0.003
WT                       0.944    0.044   −0.061    0.003    0.144
pH                       0.570    0.008   −0.088   −0.583   −0.014
EC                       0.252   −0.130    0.203    0.914   −0.001
DO                      −0.959    0.062    0.023    0.001   −0.157
TH                       0.901    0.009    0.317    0.138    0.022
TA                       0.747    0.478    0.192    0.203    0.014
NH4-N                    0.511   −0.283    0.213    0.122    0.733
TKN                      0.682   −0.205    0.194    0.036    0.639
ON                       0.860    0.032    0.132   −0.099    0.283
NO2-N                   −0.249    0.851   −0.108   −0.252    0.048
NO3-N                    0.404    0.757   −0.391   −0.032    0.235
TP                       0.283    0.872    0.067   −0.140   −0.097
SiO2                    −0.319    0.489    0.578   −0.146   −0.048
SO4                      0.122    0.098   −0.028    0.075    0.956
Na                      −0.050    0.793    0.412    0.288   −0.070
K                        0.156   −0.011    0.879   −0.137    0.234
Ca                       0.703    0.059    0.430   −0.167    0.453
TSS                     −0.142    0.748   −0.228    0.394   −0.245
TDS                     −0.138    0.177   −0.407    0.798    0.246
Eigenvalue               7.30     3.97     2.62     2.33     1.55
% Total variance        36.52    19.86    13.12    11.65     7.75
Cumulative % variance   36.52    56.39    69.52    81.17    88.93
Bold and italic values indicate strong and moderate loadings, respectively
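The loadings in Table 2 can be screened programmatically. The sketch below copies the rotated loadings from the table, labels each one with the strong/moderate/weak classes of Liu et al. (2003) quoted in the text, and counts the variables whose largest absolute loading exceeds a chosen cutoff. The function names, the extra "negligible" label for values below 0.30, and the 0.70 cutoff are assumptions made for illustration.

```python
# Rotated loadings (VF1..VF5) copied from Table 2.
GROUP_A = """
Q -0.832 0.191 -0.102 -0.087 -0.273
WT 0.568 0.318 -0.575 0.209 0.070
pH 0.379 0.823 -0.263 -0.048 -0.001
EC 0.339 -0.026 0.032 0.016 0.915
DO -0.518 -0.448 0.242 0.531 0.016
TH 0.827 0.221 -0.208 0.267 0.317
TA 0.595 0.526 -0.077 0.160 0.538
NH4-N 0.494 -0.061 -0.178 0.777 0.264
TKN 0.605 0.046 -0.260 0.691 0.237
ON 0.649 0.219 -0.339 0.377 0.134
NO2-N -0.323 0.359 0.763 0.170 0.221
NO3-N -0.108 0.943 0.042 0.213 0.074
TP -0.064 0.841 0.307 -0.117 -0.047
SiO2 0.239 -0.033 0.928 -0.085 0.087
SO4 0.135 0.160 0.147 0.861 -0.101
Na 0.734 0.103 0.288 -0.034 0.512
K 0.838 -0.131 0.022 0.102 0.432
Ca 0.876 0.132 0.064 0.290 0.133
TSS -0.648 0.476 0.217 -0.240 0.077
TDS 0.192 0.059 0.146 0.119 0.912
"""

GROUP_B = """
Q -0.286 0.101 -0.917 -0.151 -0.003
WT 0.944 0.044 -0.061 0.003 0.144
pH 0.570 0.008 -0.088 -0.583 -0.014
EC 0.252 -0.130 0.203 0.914 -0.001
DO -0.959 0.062 0.023 0.001 -0.157
TH 0.901 0.009 0.317 0.138 0.022
TA 0.747 0.478 0.192 0.203 0.014
NH4-N 0.511 -0.283 0.213 0.122 0.733
TKN 0.682 -0.205 0.194 0.036 0.639
ON 0.860 0.032 0.132 -0.099 0.283
NO2-N -0.249 0.851 -0.108 -0.252 0.048
NO3-N 0.404 0.757 -0.391 -0.032 0.235
TP 0.283 0.872 0.067 -0.140 -0.097
SiO2 -0.319 0.489 0.578 -0.146 -0.048
SO4 0.122 0.098 -0.028 0.075 0.956
Na -0.050 0.793 0.412 0.288 -0.070
K 0.156 -0.011 0.879 -0.137 0.234
Ca 0.703 0.059 0.430 -0.167 0.453
TSS -0.142 0.748 -0.228 0.394 -0.245
TDS -0.138 0.177 -0.407 0.798 0.246
"""


def parse(table):
    """Map each variable name to its list of five loadings."""
    rows = (line.split() for line in table.strip().splitlines())
    return {name: [float(v) for v in values] for name, *values in rows}


def classify(loading):
    """Label a loading by its absolute value (classes of Liu et al. 2003)."""
    size = abs(loading)
    if size > 0.75:
        return "strong"
    if size >= 0.50:
        return "moderate"
    if size >= 0.30:
        return "weak"
    return "negligible"  # label assumed here; not defined by Liu et al.


def count_above(loadings, threshold):
    """Count variables whose largest absolute loading exceeds threshold."""
    return sum(max(abs(v) for v in row) > threshold for row in loadings.values())


group_a, group_b = parse(GROUP_A), parse(GROUP_B)
for label, loadings in (("A", group_a), ("B", group_b)):
    print(f"Group {label}: {count_above(loadings, 0.75)} variables above 0.75, "
          f"{count_above(loadings, 0.70)} above 0.70")
```

With a cutoff of 0.70, the count is 14 variables for group A and 17 for group B, which coincides with the parameter counts quoted in the text; whether the authors derived their counts this way is an assumption.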
tribution of anthropogenic sources in the catchment areas.

Data structure determination and source identification

Principal component analysis/factor analysis was performed on the normalized data sets (20 variables) separately for the two different regions, viz., groups A and B, as delineated by CA techniques, to compare the compositional pattern between analyzed water samples and identify the factors influencing each one. The input data matrices (variables × cases) for PCA/FA were [20 × 24] for group A and [20 × 16] for group B. PCA of the two data sets yielded five PCs for the groups A and B sites with Eigenvalues >1, explaining 88.32% and 88.93% of the total variance in respective water quality data sets. An Eigenvalue gives a measure of the significance of the factor: the factors with the highest Eigenvalues are the most significant. Eigenvalues of 1.0 or greater are considered significant (Shrestha and Kazama 2007).

The Scree plot was used to identify the number of PCs to be retained in order to comprehend the underlying data structure (Vega et al. 1998). In the present study, the Scree plot (figure not shown) showed a pronounced change of slope after the fifth Eigenvalue. Equal numbers of VFs were obtained for the two groups of sites through FA performed on the PCs. Corresponding VFs, variable loadings, and the variance explained are presented in Table 2. Liu et al. (2003) classified the factor loadings as "strong," "moderate," and "weak," corresponding to absolute loading values of >0.75, 0.75–0.50, and 0.50–0.30, respectively.

For the data set pertaining to group A sites, among five VFs, VF1, explaining 42.42% of total variance, has strong positive loadings (>0.70) on calcium, potassium, total hardness, and sodium and strong negative loading on discharge. Thus, this factor contains hydro-geochemical variables (Ca, K, TH, and Na) originating, at a first glance, from mineralization of the geological components of soils. The contribution of Ca, Na, and K to this factor can be considered a result of cation-exchange processes at the soil–water interface (Guo and Wang 2004) and dissolution of calcium-bearing minerals, which are found in the region (Özdemir 1995b). Discharge contributes negatively to this factor, which can be explained by considering that dilution processes of dissolved minerals increase with discharge. VF2 (16.80% of the total variance) has strong positive loadings on nitrate nitrogen, total phosphorus, and pH and moderate positive loadings on total alkalinity. This factor represents the contribution of nonpoint pollution and the physicochemistry of the stream. Nonpoint sources of total phosphorus comprise soil erosion and water runoff from croplands. Nitrate nitrogen originates from numerous sources, such as geologic deposits, natural organic matter decomposition, and agricultural runoff (Madramootoo et al. 1997).

VF3 (13.54% of the total variance) has strong positive loadings on silica and nitrite nitrogen. This factor indicates that nitrite nitrogen is from domestic and agricultural wastes, whereas silica is from bed rock materials. VF4 (9.79% of the total variance) has strong positive loadings on sulfate and ammonium nitrogen. This factor represents the contribution of nonpoint pollution from agricultural areas. In these areas, farmers use ammonium sulfate fertilizers, and the stream receives ammonium and sulfate via surface runoff and irrigation waters. VF5, explaining the lowest variance (5.74%), has strong positive loadings on total dissolved solids and electrical conductivity. This factor can be interpreted as the physicochemical source of variability.

For the data set representing the group B sites, among the five significant VFs, VF1, explaining about 36.52% of total variance, has strong positive loadings on water temperature, total hardness, organic nitrogen, total alkalinity, and calcium and strong negative loadings on dissolved oxygen. This factor represents the seasonal effect of temperature, organic pollution from domestic wastes, and stream bed material. The inverse relationship between temperature and dissolved oxygen is a natural process because warmer water becomes saturated more easily with oxygen, and it can hold less dissolved oxygen. The negative relationship between organic nitrogen and dissolved oxygen can be explained by the fact that high levels of dissolved organic matter consume large amounts of oxygen (Singh et al. 2004). Total alkalinity and
total hardness are linked with common sources of natural processes of dissolution of soil constituents, mainly calcium carbonates. Lithograph composition indicates that parent rock material of this region contains high levels of Ca (Özdemir 1995b). VF2 (19.86% of the total variance) has strong positive loadings on total phosphorus, nitrite nitrogen, nitrate nitrogen, sodium, and suspended solids. This factor represents the point pollution, nonpoint pollution, and erosion effect. While point pollution is from domestic wastewater, nonpoint pollution is from agricultural and livestock farms. The erosion effect occurs during cultivation of soil and rainfall events from upland areas. It may be noted that predominant soils in the lower Behrimaz Stream Basin are lithosolic soils, which are prone to erosion particularly when coupled with cultivated fields, moderate to steep slopes, and intense precipitation. One of the main sources of total phosphorus in runoff is soils with high phosphorus levels. Fertilization and manure spreading can contribute to high levels of soil phosphorus. Suspended particles tend to have adsorbed phosphorus. VF3 (13.12% of total variance) has strong positive loadings on potassium. This factor represents agricultural runoff from potassium fertilizers. VF4, explaining 11.65% of the total variance, has strong positive loadings on total dissolved solids and electrical conductivity. This factor can be interpreted as the physicochemical source of variability. VF5 (7.75% of the total variance) has strong positive loadings on sulfate and ammonium nitrogen. This factor represents the contribution of nonpoint pollution from agricultural runoff. In these areas, ammonium sulfate fertilizers are commonly used.

In this case study, FA did not result in much data reduction, as we still need 14 parameters (about 70% of the 20 parameters) to explain 88% of the data variance of group A sites and 17 parameters (about 85% of the 20 parameters) to explain 89% of the data variance of group B sites (Table 2). However, FA served as a means to identify those parameters that contribute most to temporal variation in the stream water quality. A similar approach based on FA/PCA for evaluation of temporal and spatial variations in water quality has been used earlier (Vega et al. 1998; Helena et al. 2000; Wunderlin et al. 2001; Simeonov et al. 2003; Singh et al. 2004; Shrestha and Kazama 2007).

Conclusions

In this case study, different multivariate statistical methods were used to assess temporal and spatial variations in surface water quality of the Behrimaz Stream. Hierarchical cluster analysis grouped 12 months into two periods (the first and second periods) and classified four sampling sites into two groups (A and B) based on the similarity of water quality characteristics. The temporal and spatial similarities and groupings could facilitate the design of an optimal future monitoring strategy that could decrease monitoring frequency, the number of sampling stations, and the corresponding costs. The factor analysis/principal component analysis helped extract and identify the factors/sources responsible for variations in stream water quality at two different sampling sites. However, FA/PCA did not result in a significant data reduction as it points to 14 parameters (70% of original 20) required to explain 88% of the data variability of group A sites and 17 parameters (85% of original 20) required to explain 89% of the data variability of group B sites. Varifactors obtained from factor analysis indicate that the parameters responsible for water quality variations are mainly related to discharge, temperature, and soluble minerals (natural) and nutrients (nonpoint sources: agricultural activities) in relatively LP areas and organic pollution (point source: domestic wastewater) and nutrients (nonpoint sources: agricultural activities) in MP areas in the basin. Thus, this study illustrates the utility of multivariate statistical techniques for analysis and interpretation of complex data sets and, in water quality assessment, identification of pollution sources/factors and understanding temporal/spatial variations in water quality for effective stream water quality management.

References

APHA (1995). Standard methods for the examination of water and waste water. Washington, DC: American Public Health Association.
Brumelis, G., Lapina, L., Nikodemus, O., & Tabors, G. (2000). Use of an artificial model of monitoring data to aid interpretation of principal component analysis. Environmental Modelling & Software, 15(8), 755–763. doi:10.1016/S1364-8152(00)00060-8.
Carpenter, S. R., Caraco, N. F., Correll, D. L., Howarth, R. W., Sharpley, A. N., & Smith, V. H. (1998). Nonpoint pollution of surface waters with phosphorus and nitrogen. Ecological Applications, 8(3), 559–568. doi:10.1890/1051-0761(1998)008[0559:NPOSWW]2.0.CO;2.
Dixon, W., & Chiswell, B. (1996). Review of aquatic monitoring program design. Water Research, 30, 1935–1948. doi:10.1016/0043-1354(96)00087-5.
Fent, K. (2004). Ecotoxicological effects at contaminated sites. Toxicology, 205, 223–240. doi:10.1016/j.tox.2004.06.060.
Guo, H., & Wang, Y. (2004). Hydrogeochemical processes in shallow quaternary aquifers from the northern part of the Datong Basin, China. Applied Geochemistry, 19, 19–27. doi:10.1016/S0883-2927(03)00128-8.
Günek, H., & Yiğit, A. (1995). Hydrographic properties of Lake Hazar basin. In Proceedings of I. Lake Hazar and its environment symposium (pp. 91–103). Elazığ.
Helena, B., Pardo, R., Vega, M., Barrado, E., Fernandez, J. M., & Fernandez, L. (2000). Temporal evolution of groundwater composition in an alluvial aquifer (Pisuerga river, Spain) by principal component analysis. Water Research, 34, 807–816. doi:10.1016/S0043-1354(99)00225-0.
Liu, C. W., Lin, K. H., & Kuo, Y. M. (2003). Application of factor analysis in the assessment of groundwater quality in a blackfoot disease area in Taiwan. The Science of the Total Environment, 313, 77–89. doi:10.1016/S0048-9697(02)00683-6.
Love, D., Hallbauer, D., Amos, A., & Hranova, R. (2004). Factor analysis as a tool in groundwater quality management: Two southern African case studies. Physics and Chemistry of the Earth, 29(15–18), 1135–1143.
Madramootoo, C. A., Johnston, W. R., & Willardson, L. S. (1997). Management of agricultural drainage water quality: Water Reports 13.
Özdemir, M. A. (1995a). Geomorphology of Lake Hazar Basin and formation of Lake. In Proceedings of I. Lake Hazar and its environment symposium (pp. 121–148). Elazığ.
Özdemir, M. A. (1995b). Erosion problem in Lake Hazar Basin (Elazığ) and its precautions. In Proceedings of I. Lake Hazar and its environment symposium (pp. 229–243). Elazığ.
Qadir, A., Malik, R. N., & Husain, S. Z. (2007). Spatio-temporal variations in water quality of Nullah Aik-tributary of the river Chenab, Pakistan. Environmental Monitoring and Assessment, 140(1–3), 43–59. doi:10.1007/s1066100798464.
Reghunath, R., Murthy, T. R. S., & Raghavan, B. R. (2002). The utility of multivariate statistical techniques in hydrogeochemical studies: An example from Karnataka, India. Water Research, 36, 2437–2442. doi:10.1016/S0043-1354(01)00490-0.
Sarbu, C., & Pop, H. F. (2005). Principal component analysis versus fuzzy principal component analysis. A case study: The quality of Danube water (1985–1996). Talanta, 65, 1215–1220. doi:10.1016/j.talanta.2004.08.047.
Shrestha, S., & Kazama, F. (2007). Assessment of surface water quality using multivariate statistical techniques: A case study of the Fuji river basin, Japan. Environmental Modelling & Software, 22, 464–475. doi:10.1016/j.envsoft.2006.02.001.
Simeonov, V., Stratis, J. A., Samara, C., Zachariadis, G., Voutsa, D., Anthemidis, A., et al. (2003). Assessment of the surface water quality in Northern Greece. Water Research, 37, 4119–4124. doi:10.1016/S0043-1354(03)00398-1.
Simeonova, P., Simeonov, V., & Andreev, G. (2003). Water quality study of the Struma River Basin, Bulgaria (1989–1998). Central European Journal of Chemistry, 1, 136–212. doi:10.2478/BF02479264.
Singh, K. P., Malik, A., Mohan, D., & Sinha, S. (2004). Multivariate statistical techniques for the evaluation of spatial and temporal variations in water quality of Gomti River (India): A case study. Water Research, 38, 3980–3992. doi:10.1016/j.watres.2004.06.011.
Singh, K. P., Malik, A., & Sinha, S. (2005). Water quality assessment and apportionment of pollution sources of Gomti river (India) using multivariate statistical techniques: A case study. Analytica Chimica Acta, 538, 355–374. doi:10.1016/j.aca.2005.02.006.
Şen, B., Koçer, M. A. T., & Alp, M. T. (2002). Some physical and chemical properties of running waters flowing into the Lake Hazar. Science and Engineering Journal of Firat University, 14(1), 241–248.
Vega, M., Pardo, R., Barrado, E., & Deban, L. (1998). Assessment of seasonal and polluting effects on the quality of river water by exploratory data analysis. Water Research, 32, 3581–3592. doi:10.1016/S0043-1354(98)00138-9.
Wunderlin, D. A., Diaz, M. P., Ame, M. V., Pesce, S. F., Hued, A. C., & Bistoni, M. A. (2001). Pattern recognition techniques for the evaluation of spatial and temporal variations in water quality. A case study: Suquia river basin (Cordoba, Argentina). Water Research, 35, 2881–2894. doi:10.1016/S0043-1354(00)00592-3.
Yiğit, A., & Çitçi, M. D. (1995). Agricultural activities in Lake Hazar and Behrimaz Watersheds. In Proceedings of I. Lake Hazar and its environment symposium (pp. 153–165). Elazığ.
Zhou, F., Liu, Y., & Guo, H. C. (2007). Application of multivariate statistical methods to the water quality assessment of the watercourses in the northwestern New Territories, Hong Kong. Environmental Monitoring and Assessment, 132(1–3), 1–13. doi:10.1007/s106610069497x.
