
A Novel Adaptive Algorithm for Spatial Interpolation

Jiaogen Zhou1,2, Xu Chen2, Dongmei You2, Daoyou Huang1*


1 Institute of Subtropical Agriculture, The Chinese Academy of Science, Changsha, China
2 Center of Information Technology in Agriculture, Shanghai Academy of Agricultural Sciences, Shanghai, China
*Corresponding author, e-mail: dyhuang@isa.ac.cn

Supported by the National Water Pollution Control and Management Technology Major Projects (No. 2009ZX07212-001-05), the Hunan Science and Technology Major Projects (No. 2009FJ1005), the Shanghai Natural Science Foundation (No. 11ZR1432700), and the Development Fund of Shanghai Academy of Agricultural Sciences (No. 2010-11).

978-1-61284-848-8/11/$26.00 ©2011 IEEE

Abstract-The spatial distributions of soil heavy metals are not continuous where the landscape is heterogeneous. In this context, existing spatial interpolation methods cannot accurately estimate the contents of soil heavy metals at unknown locations. In this paper, we present a novel adaptive algorithm for spatial interpolation of soil heavy metals against a spatially heterogeneous background. The core idea of our algorithm is to impose a constraint on spatial interpolation: the spatial estimation of soil heavy metals is carried out only within a context of spatial homogeneity. Our algorithm consists of extracting land patches, polygon merging, minimum unit partitioning and an interpolation operator. The experimental results on real soil data show that our algorithm is more reliable than the ordinary kriging method.

Keywords-adaptive algorithm; spatial interpolation; heavy metals; Geographic Information System (GIS)

I. INTRODUCTION
How to accurately and effectively describe the spatial non-point-source distribution of heavy metal contents from discrete data at monitoring locations has long attracted interest in soil science and ecology, and many studies have addressed the issue.

Classical geostatistical theory, proposed in the 1960s and 1970s, is considered a good method for solving the problem [1-4]. Its core idea is regional variance, which hypothesizes that soil properties are spatially autocorrelated within a certain scale [5-6]. Under spatial autocorrelation, it is reasonable to estimate the soil properties of non-monitored locations from soil data at known, monitored points [7-8]. Many published studies show that regional variance theory can appropriately describe the spatial variability and distribution of soil heavy metals. Moreover, a series of spatial estimation methods based on regional variance theory, such as ordinary kriging, universal kriging, co-kriging, regression kriging and trend surface analysis, have been shown to give good estimates of soil heavy metal contents at unknown locations from data at known, discrete locations [5,7-9]. However, these approaches rest on an underlying assumption: good and accurate interpolation at unknown locations can be achieved only if the data from known locations used for interpolation satisfy spatial homogeneity.

In fact, owing to external factors such as land use types, cropping regimes and human activities, the contents of soil heavy metals show significant spatial variability at middle or large scales, especially in suburbs. This means that the distribution of soil heavy metal contents does not follow spatial autocorrelation, because their spatial variability increases with the research scale. Good interpolation of unknown locations can therefore be obtained at a small scale, where the known data are spatially homogeneous, but at a middle or large scale the spatial heterogeneity of heavy metal contents can lead the above-mentioned methods to underestimate or overestimate the values at unknown locations.

Soil-landscape model theory, which is grounded in soil formation, gives full consideration to the impact of environmental factors and landscape on soil properties, and seeks to predict the soil properties of unknown locations from known soil properties and landscape environmental factors. Zhu et al. and McBratney et al. were early users of soil-landscape model theory to guide soil survey work and digital soil mapping, and obtained a series of results [10-13]. This method has been extensively applied to soil landscape classification and to the prediction of soil thickness, soil humus, organic matter and salt contents [14-18], but few studies have yet estimated heavy metal contents using a soil-landscape model. Quantifying the relationship between soil heavy metal contents and external environmental factors remains a big challenge [19].

In this paper, we integrate the advantages of both regional variance theory and the soil-landscape model method, and propose a novel adaptive interpolation algorithm for soil heavy metals. The core idea of our algorithm is to impose a spatial homogeneity constraint on interpolation: all known points that participate in the estimation of an unknown point must share a spatially homogeneous background with it.

The rest of the paper is organized as follows. Section II describes the collaborative change assumption underlying our algorithm. Section III introduces the construction and implementation of the adaptive algorithm. Section IV presents experimental results on real data, and finally, Section V concludes the paper.

II. THE PROPOSED COLLABORATIVE CHANGE ASSUMPTION
As mentioned above, traditional interpolation methods assume that soil properties are spatially autocorrelated, so that they can be described by discrete and sparse data. However, automatically determining the spatial autocorrelation range of soil properties is very difficult with current interpolation methods, so it is easy to underestimate or overestimate the value at an unknown location in a region with significant spatial variance of the distribution
of soil properties. For example, consider two neighboring arable plots, one used for food crops and the other for vegetables. Naturally, the water and fertilizer inputs to the vegetable plot are usually higher than those to the food crop plot. In the long run, if the cropping systems and fertilizer management practices remain unchanged, the contents of soil properties such as organic matter, total nitrogen and heavy metals in the vegetable plot become significantly higher than those in the food crop plot. To analyze the spatial distribution of soil properties in the two plots, sampling and analysis may be carried out on the terraces, and some interpolation method is then used to predict their spatial distribution. When a traditional interpolation method is used, an interesting phenomenon appears in the adjacent zones between the two plots: relatively high contents of soil properties appear close to the food crop zone, while relatively low contents tend to appear near the vegetable zone. This is contrary to the actual situation. In this paper, we refer to this phenomenon as the crossing boundary interpolation problem. It is discussed further below.

Figure 1. Description of crossing boundary interpolation and spatial homogeneity-based interpolation

Fig. 1 shows a soil area consisting of eight soil blocks, A to H, of different classes. The heavy metal contents are similar within the same soil class and differ between classes. The soil area is partitioned into 8 grid units, and a sampling point is deployed in each grid unit (the black dots in Fig. 1). Suppose that the heavy metal contents in soil units C and G exceed the heavy metal pollution level while those in the other units do not; the content distributions of heavy metals in soil units A and B are then relatively homogeneous. Because the content distribution of heavy metals over the whole area is spatially heterogeneous, the interpolation results become unreliable. For example, after interpolation, the heavy metal contents in units F and H, which are close to the contaminated unit C, will be relatively high, and in contrast, those in the part of unit C adjacent to the non-polluted unit A will be relatively low. This is what we call the crossing boundary problem.

To some extent, the distributions of the factors that influence heavy metal contents, such as landscape features, land use types and cropping regimes, are globally spatially heterogeneous but locally spatially homogeneous. Since the contents of heavy metals are directly affected by these external factors, we can reasonably assume that the contents of heavy metals are also globally heterogeneous but locally homogeneous. We call this the collaborative change assumption. Based on this assumption, the key to solving the crossing boundary interpolation problem is to determine the locally homogeneous regions of the distribution of heavy metal contents. Once the homogeneous regions are determined, interpolation of unknown locations is constrained to spatially homogeneous backgrounds, which avoids cross-border interpolation.

III. ALGORITHM CONSTRUCTION AND IMPLEMENTATION
To solve the cross-border interpolation problem, we assume that the content distributions of soil heavy metals change collaboratively with the external factors, and propose a novel adaptive interpolation algorithm based on this assumption. The core idea of the algorithm is that all data from known sites involved in the interpolation of an unknown site have exactly the same spatially homogeneous background as the unknown site. The algorithm consists of extracting land patches, polygon merging, minimum unit partitioning and an interpolation operator.

A. Extracting Spatially Homogeneous Units
In our algorithm, identifying the locally homogeneous regions of the distribution of heavy metal contents is very important. Determining the homogeneous regions is in fact a classification of the whole soil area. To obtain accurate soil classification results, the number of soil types must be known in advance, but such a priori knowledge is difficult to obtain in most cases. Therefore, a self-convergence spectral clustering algorithm [20] is first performed to obtain the preliminary number of soil types, and then visual interpretation is carried out to obtain the soil classes.

B. Polygon Merging
After soil classification, a soil class may contain many small and sparse polygons, which harms the subsequent interpolation. It is therefore necessary to merge small and sparse polygons to reduce the computational complexity. We use a human-computer interaction model to solve the problem. In this model, users decide whether two adjacent polygons should be merged according to their prior knowledge. If so, a new soil class is formed. Naturally, the classification results are directly influenced by the users' prior knowledge: the richer that knowledge, the more accurate the results.

C. Minimum Interpolation Unit Partitioning
Minimum unit partitioning divides the whole soil area into regular grids at a certain scale; each resulting grid unit is called a minimum interpolation unit.
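As a rough illustration of this partitioning step, the following Python sketch divides a rectangular bounding box into square cells and records which cells contain a sample. It is not the authors' VB.NET implementation: the function name `partition_into_units`, the rectangular `bounds`, the square `cell_size`, and the rule that a later sample in the same cell overwrites an earlier one are all assumptions made for the example.

```python
import math


def partition_into_units(bounds, cell_size, samples):
    """Divide a rectangular area into regular grid cells and attach sample values.

    bounds:    (xmin, ymin, xmax, ymax) of the soil area (assumed rectangular)
    cell_size: edge length of one square grid cell (assumed square cells)
    samples:   iterable of (x, y, value) tuples
    Returns a 2-D list indexed [row][col]; a cell holds a sample value if a
    sample falls inside it, otherwise None.
    """
    xmin, ymin, xmax, ymax = bounds
    n_cols = math.ceil((xmax - xmin) / cell_size)
    n_rows = math.ceil((ymax - ymin) / cell_size)
    field = [[None] * n_cols for _ in range(n_rows)]
    for x, y, value in samples:
        # Ignore samples that lie outside the soil area.
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue
        # Points on the upper/right edge are clamped into the last cell.
        col = min(int((x - xmin) // cell_size), n_cols - 1)
        row = min(int((y - ymin) // cell_size), n_rows - 1)
        # Assumption: if several samples share a cell, the last one wins.
        field[row][col] = value
    return field
```

Cells left as `None` are the ones whose values still have to be estimated from cells that hold a sample.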
The grid resolution should be finer than the sampling resolution. If a grid unit contains a sampling point, it is called a participating unit; otherwise, it is called an interpolating unit. Additionally, a new field is added to all grid units. For participating units, the field value equals the heavy metal content of the corresponding sampling point, while for interpolating units it is set to null. Finally, the field values of participating units are used to estimate those of interpolating units.

D. Interpolation Operator
Interpolation of the heavy metal contents at unknown sites is based on the classification of soil units. That is, the soil unit class map is used to classify the sampling points: points located in the same soil class are placed in the same category. Given a soil class $t$ and an unknown location $x_{t0}$, its estimated value $Z(x_{t0})$ satisfies:

$$Z(x_{t0}) = \bar{Z}(x_t) + r(x_{t0}), \qquad (1)$$

$$\bar{Z}(x_t) = \frac{1}{n(t)} \sum_{j=1}^{n(t)} Z(x_{tj}), \qquad (2)$$

$$r(x_{t0}) = \sum_{j=1}^{n(t)} \lambda_{tj} \left( Z(x_{tj}) - \bar{Z}(x_t) \right). \qquad (3)$$

Here, $n(t)$ stands for the number of samples located in soil class $t$, $\bar{Z}(x_t)$ for the mean of the samples, $Z(x_{tj})$ for the value of the sample $x_{tj}$, $r(x_{t0})$ for the residual, and $\lambda_{tj}$ for the weight coefficient.

IV. EXPERIMENT

A. Data Preparation
In this paper, all algorithms are implemented in VB.NET and run on our independent platform for secondary development of GIS. The real experimental data consist of the heavy metal Cd contents of 1520 sampling points collected during the three years 2006-2008 in Daxing district of Beijing, China. Two remote sensing images, a TM image of 30 m resolution and a Beijing-1 satellite image of 100 m resolution, are merged with the ENVI software, and then a spectral clustering method is performed on the merged image to obtain the preliminary number of soil types.

B. Comparison of Our Algorithm with Kriging
Our algorithm is compared with the ordinary kriging method on the 1520 heavy metal Cd data from Daxing district of Beijing. Because all sampling points were collected from arable land and none from residential and building areas, interpolation in residential and building areas is neither reasonable nor reliable. Ordinary kriging cannot solve this cross-border interpolation problem: it interpolates over the entire Daxing area, including the residential and building areas with no monitoring data (as shown in Fig. 2). In the absence of monitoring data from residential and building areas, it is very likely to overestimate or underestimate their Cd contents. In contrast, our algorithm interpolates only over arable areas and not over residential and building areas (as shown in Fig. 2). In comparison, interpolation with our method is more reliable than that with ordinary kriging.

Figure 2. Comparison of our algorithm with the kriging method, panels (a) and (b)

V. CONCLUSION
In this paper, we have introduced and discussed the crossing boundary interpolation problem. A novel interpolation algorithm based on a collaborative change assumption has been proposed to solve the problem. The algorithm consists of extracting spatially homogeneous soil classes, polygon merging, minimum interpolation unit partitioning and an interpolation operator. The experiment on real data shows that our algorithm is more reliable than the ordinary kriging method. Experimental comparisons of our algorithm with other interpolation methods will be conducted in the next step.
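For illustration, the class-constrained interpolation operator of Section III-D, Eqs. (1)-(3), can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' VB.NET code. The paper does not say how the weight coefficients are obtained, so the sketch assumes normalized inverse-distance weights. The function name `estimate_in_class`, the `power` parameter and the tuple layout of `samples` are likewise hypothetical.

```python
import math


def estimate_in_class(samples, target_xy, target_class, power=2.0):
    """Estimate the value at target_xy using only samples of the same soil class.

    samples: iterable of (x, y, soil_class, value) tuples (assumed layout).
    Returns None when the class contains no samples, so nothing is
    borrowed from a different (non-homogeneous) class.
    """
    same_class = [(x, y, v) for x, y, c, v in samples if c == target_class]
    if not same_class:
        return None

    # Eq. (2): mean of the samples in the class.
    class_mean = sum(v for _, _, v in same_class) / len(same_class)

    # Assumed weights: inverse distance, normalized to sum to 1.
    raw_weights = []
    for x, y, v in same_class:
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            return v  # the target coincides with a sample
        raw_weights.append(dist ** -power)
    total = sum(raw_weights)

    # Eq. (3): weighted sum of deviations from the class mean.
    residual = sum((w / total) * (v - class_mean)
                   for w, (_, _, v) in zip(raw_weights, same_class))

    # Eq. (1): class mean plus residual.
    return class_mean + residual
```

With two arable samples of 1.0 and 3.0 at equal distance from the target, the estimate is their mean, 2.0, however large the value of a nearby sample from another class. This is the behaviour the spatial homogeneity constraint is meant to produce.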
ACKNOWLEDGMENT
The authors are very grateful to those who collected and analyzed the data of soil heavy metal Cd used in the paper.

REFERENCES
[1] B. Palumbo, M. Angelone, and A. Bellanca, “Influence of inheritance and pedogenesis on heavy metal distribution in soil of Sicily, Italy,” Geoderma, vol. 95, 2000, pp. 247-266.
[2] Y. Lin and T. Chang, “Simulated annealing and kriging method for identifying the spatial patterns and variability of soil heavy metal,” J. Environ. Sci. Health, Part A, Toxic/Hazard, vol. 35(7), 2000, pp. 1089-1115.
[3] A. Facchinelli, E. Sacchi, and L. Mallen, “Multivariate statistical and GIS-based approach to identify heavy metal sources in soils,” Environmental Pollution, vol. 114, 2001, pp. 313-324.
[4] C. Zhang, “Using multivariate analysis and GIS to identify pollutants and their spatial patterns in urban soils in Galway, Ireland,” Environ. Pollut., vol. 142, 2006, pp. 501-511.
[5] P. Goovaerts, “Geostatistical modeling of uncertainty in soil science,” Geoderma, vol. 103, 2001, pp. 3-26.
[6] C. Zhang, D. Fay, and D. McGrath, “Statistical analysis of geochemical variables in soils of Ireland,” Geoderma, vol. 146, 2008, pp. 378-390.
[7] J. Cattle and B. McBratney, “Kriging method evaluation for assessing the spatial distribution of urban soil lead contamination,” J. Environ. Qual., vol. 31, 2002, pp. 1576-1588.
[8] A. Castrignano and G. Buttafuoco, “Geostatistical stochastic simulation of soil water content in a forested area of south Italy,” Biosyst. Eng., vol. 87(2), 2004, pp. 257-266.
[9] C. Wu, J. Wu, and Y. Luo, “Statistical and geostatistical characterization of heavy metal concentrations in a contaminated area taking into account soil map units,” Geoderma, vol. 144, 2008, pp. 171-179.
[10] A. Zhu, L. Band, and R. Vertessy, “Derivation of soil properties using a soil land inference model (SoLIM),” Soil Science Society of America Journal, vol. 61, 1997, pp. 523-533.
[11] A. Zhu, B. Hudson, and J. Burt, “Soil mapping using GIS, expert knowledge, and fuzzy logic,” Soil Science Society of America Journal, vol. 65, 2001, pp. 1463-1472.
[12] A. Zhu, “Mapping soil landscape as spatial continua: the neural network method,” Water Resources Research, vol. 36, 2000, pp. 663-677.
[13] A. McBratney, M. Mendonca, and B. Minasny, “On digital soil mapping,” Geoderma, vol. 117, 2003, pp. 3-52.
[14] B. Hudson, “The soil survey as paradigm-based science,” Soil Science Society of America Journal, vol. 56, 1992, pp. 826-841.
[15] O. Odeh, A. McBratney, and D. Chittleborough, “Spatial prediction of soil properties from landform attribute derived from a digital elevation model,” Geoderma, vol. 63, 1994, pp. 197-214.
[16] F. Carre, A. McBratney, T. Mayr, and L. Montanarella, “Digital soil assessments: Beyond DSM,” Geoderma, vol. 142, 2007, pp. 69-79.
[17] S. Lesch and D. Corwin, “Prediction of spatial soil property information from ancillary sensor data using ordinary linear regression: Model derivations, residual assumptions and model validation tests,” Geoderma, vol. 148, 2008, pp. 130-140.
[18] L. Grinand, D. Arrouays, and B. Laroche, “Extrapolating regional soil landscapes from an existing soil map: Sampling intensity, validation procedures, and integration of spatial context,” Geoderma, vol. 143, 2008, pp. 180-19.
[19] D. You, J. Zhou, J. Wang, and Z. Ma, “Analysis of relations of heavy metal accumulation with land utilization using positive and negative association rule method,” Mathematical and Computer Modelling, DOI: 10.1016/j.mcm.2010.11.028.
[20] J. Li, J. Zhou, W. Huang, J. Zhang, and X. Yang, “Grouping objects in multi-band images using improved eigenvector-based algorithm,” Mathematical and Computer Modelling, vol. 51, 2010, pp. 1332-1338.
