Bai Giang TKMT

Introduction to Geostatistics
Centre de Géostatistique - EMP
Founded 1968 by: Georges Matheron
Director: Jean-Paul Chilès
Permanent staff: 14 research scientists

Funding (salaries): 60% by contract research,
→ this leads to application-driven research

CG: Main application fields
Petroleum exploration, mining

Environmental sciences, climatology
Health: epidemiology
Fisheries, demography . . .
Software products: Isatis, Heresim. . .

sold by Geovariances International (www.geovariances.fr)
Also:
Bioinformatics group (SVM, kernel methods)

Geostatistics worldwide
Other groups:
Stanford (petroleum), Trondheim (petroleum), Calgary
(mining, petroleum), Brisbane (mining), Johannesburg
(mining), Valencia (hydrogeology),. . .
Main meetings:
International Geostatistics Conference:
1st in Rome (1975), . . . , 7th in Banff (2004)
−→ 2008: Santiago de Chile
geoENV (European geostatistics conference for
environmental applications):
1st in Lisbon (1996),. . . , 5th in Neuchatel (2004)
−→ 2006: Greece
Software:
R (www.r-project.org), −→ www.ai-geostats.org

Geostatistics
definition

Geostatistics
is an application of
the Theory of Regionalized Variables
(usually considered as realizations of Random Functions)
to geology and mining (fifties)

to natural phenomena in general (seventies)
(re-)integrated mainstream statistics (nineties)

Concepts
Variogram: description of the spatial/temporal correlation of a phenomenon.
Kriging: optimal linear prediction method for estimating values of a phenomenon at any location of a region (→ D. G. Krige).
Conditional Simulation: stochastic simulation of realizations, conditional upon the data.

Basic Statistics
concepts

Center of mass
Seven weights w are hanging on a bar whose own weight is negligible:
[Figure: bar graduated from 5 to 8 carrying the weights w, each made up of elementary weights v, with the center of mass marked]

Center of mass
The weights w are suspended at points:
z = 5, 5.5, 6, 6.5, 7, 7.5, 8.
The mass w(z) of the weights is
w(z) = 3, 4, 6, 3, 4, 4, 2.
The location $\bar{z}$ where the bar, when suspended, stays in equilibrium is:
\[ \bar{z} = \frac{1}{\sum_k w(z_k)} \sum_{k=1}^{7} z_k \, w(z_k) \]

Center of mass
Defining normed weights:
\[ p(z_k) = \frac{w(z_k)}{\sum_k w(z_k)} \]
with $\sum_k p(z_k) = 1$, we can write:
\[ \bar{z} = \sum_{k=1}^{7} z_k \, p(z_k) \]

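Center of mass
The weights $w(z_k)$ are subdivided into n elementary weights $v_\alpha$:
[Figure: bar graduated from 5 to 8 showing the weights w split into elementary weights v, with the center of mass marked]
with corresponding normed weights $p_\alpha = 1/n$:
\[ \bar{z} = \sum_{\alpha=1}^{n} z_\alpha \, p_\alpha = \frac{1}{26} \sum_{\alpha=1}^{26} z_\alpha = 6.4 \]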

Center of mass
The average squared distance to the center of mass
\[ \mathrm{dist}^2(\bar{z}) = \frac{1}{n} \sum_{\alpha=1}^{n} (z_\alpha - \bar{z})^2 = 0.83 \]
gives an indication of the dispersion of the data around the center of mass $\bar{z}$.
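The center-of-mass figures can be checked numerically. The sketch below uses the suspension points and weights given on the slides; the variable names are illustrative only.

```python
import numpy as np

z = np.array([5, 5.5, 6, 6.5, 7, 7.5, 8])   # suspension points z_k
w = np.array([3, 4, 6, 3, 4, 4, 2])          # weights w(z_k)

p = w / w.sum()                        # normed weights p(z_k), summing to 1
z_bar = np.sum(z * p)                  # center of mass
dist2 = np.sum(p * (z - z_bar) ** 2)   # average squared distance to z_bar

# Same quantities from the n = 26 elementary weights, each with p_alpha = 1/n
z_alpha = np.repeat(z, w)
print(round(float(z_bar), 1), round(float(dist2), 2))
```

Both routes give the same center of mass and dispersion, since repeating each point w(z_k) times is equivalent to weighting it by p(z_k).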

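Histogram
[Figure: histogram of the values on an axis from 5 to 8, with the mean m* marked]
The mean value $m^*$ of the data $z_\alpha$ is, equivalently,
\[ m^* = \frac{1}{n} \sum_{\alpha=1}^{n} z_\alpha \]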

Histogram
The average squared deviation from the mean is the variance
\[ s^2 = \frac{1}{n} \sum_{\alpha=1}^{n} (z_\alpha - m^*)^2 \]
Its square root is called the standard deviation.
The normalized weights $p(z_k)$ are the frequencies of occurrence of the values z = 5, 5.5, 6, 6.5, 7, 7.5, 8.
n is the number of samples.

Cumulative histogram
An alternate way to represent the frequencies of the values z is
to cumulate them from left to right:
[Figure: cumulative frequency plotted against z]

Probability distribution
Suppose we draw randomly values z from a set of values Z.
We call Z a random variable and z its realizations, $z \in \mathbb{R}$.
The mathematical idealization of the cumulative histogram is the probability distribution function F(z) defined as:
\[ F(z) = P(Z < z) \]
The probability P(Z < z) indicates the theoretical frequency of drawing a realization lower than a given value z.

Probability density
We shall only consider differentiable distribution functions.
The derivative of the probability distribution function is the probability density p(z):
\[ F(dz) = p(z)\, dz \]
Properties:
\[ p(z) \ge 0, \qquad \int p(z)\, dz = 1 \]

Expected value
The idealization of the concept of mean value is the mathematical expectation:
\[ E[Z] = \int_{z \in \mathbb{R}} z \, p(z)\, dz = m. \]
The expectation is a linear operator.
Let a, b be constants:
\[ E[a] = a, \qquad E[bZ] = b\, E[Z], \]
so that
\[ E[a + bZ] = a + b\, E[Z] \]

Variance
The second moment of the random variable Z is:
\[ E[Z^2] = \int_{z \in \mathbb{R}} z^2 \, p(z)\, dz \]
The variance $\sigma^2$ is defined as:
\[ \operatorname{var}(Z) = E\big[(Z - E[Z])^2\big] = E\big[(Z - m)^2\big] = \sigma^2 \]
Alternate expression: multiplying out we get
\[ \operatorname{var}(Z) = E\big[Z^2 + m^2 - 2 m Z\big] \]
and, as the expectation is a linear operator,
\[ \operatorname{var}(Z) = E[Z^2] - \big(E[Z]\big)^2 \]

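Covariance
Covariance $\sigma_{ij}$ between $Z_i$ and $Z_j$:
\[ \operatorname{cov}(Z_i, Z_j) = E\big[(Z_i - E[Z_i]) \cdot (Z_j - E[Z_j])\big] = E\big[(Z_i - m_i) \cdot (Z_j - m_j)\big] = \sigma_{ij} \]
where $m_i$ and $m_j$ are the means of the random variables.
Covariance of $Z_i$ with itself:
\[ \sigma_{ii} = E\big[(Z_i - m_i)^2\big] = \sigma_i^2 \]
Correlation coefficient:
\[ \rho_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_i^2 \, \sigma_j^2}} \]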

Linear regression

Regression line
[Figure: scatter diagram of z1 against z2 with the regression line z1* = a z2 + b; the means m1* and m2* are marked on the axes]

Optimal regression line
Two variables with experimental covariance:
\[ s_{12} = \frac{1}{n} \sum_{\alpha=1}^{n} (z_1^\alpha - m_1^*) \cdot (z_2^\alpha - m_2^*) \]
The regression line is: $z_1^* = a\, z_2 + b$
with slope a and intercept b.
Minimizing the quadratic distance:
\[ \mathrm{dist}^2(a, b) = \frac{1}{n} \sum_{\alpha=1}^{n} (z_1^\alpha - a\, z_2^\alpha - b)^2 \]
we get
\[ a = \frac{s_{12}}{s_2^2}, \qquad b = m_1^* - a\, m_2^* \]

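Optimal regression line
\[ z_1^* = \frac{s_{12}}{s_2^2} (z_2 - m_2^*) + m_1^* = m_1^* + r_{12} \frac{s_1}{s_2} (z_2 - m_2^*) \]
where $r_{12} = s_{12} / (s_1 s_2)$ is the experimental correlation coefficient.
At the minimum the squared distance is:
\[ \mathrm{dist}^2_{\min}(a, b) = s_1^2 \left( 1 - (r_{12})^2 \right) \]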
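As a quick numerical check of the slope, intercept and minimal squared distance, here is a minimal sketch on synthetic data. The data and the "true" coefficients 1.5 and 3.0 are made up for illustration, and variances use the 1/n convention of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
z2 = rng.normal(10.0, 2.0, size=200)                    # synthetic explanatory variable
z1 = 1.5 * z2 + 3.0 + rng.normal(0.0, 1.0, size=200)    # synthetic response

m1, m2 = z1.mean(), z2.mean()
s12 = np.mean((z1 - m1) * (z2 - m2))    # experimental covariance
s1sq, s2sq = z1.var(), z2.var()         # experimental variances (1/n)

a = s12 / s2sq                          # slope
b = m1 - a * m2                         # intercept
r12 = s12 / np.sqrt(s1sq * s2sq)        # correlation coefficient

dist2_min = np.mean((z1 - a * z2 - b) ** 2)
print(a, b, dist2_min, s1sq * (1 - r12 ** 2))
```

The last two printed values agree, which is the identity stated for the squared distance at the minimum.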

Multiple linear regression

Multivariate data set
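The data matrix Z with n samples (rows) of N variables (columns):
\[ \mathbf{Z} = \begin{pmatrix}
z_{11} & \dots & z_{1i} & \dots & z_{1N} \\
\vdots & & \vdots & & \vdots \\
z_{\alpha 1} & \dots & z_{\alpha i} & \dots & z_{\alpha N} \\
\vdots & & \vdots & & \vdots \\
z_{n1} & \dots & z_{ni} & \dots & z_{nN}
\end{pmatrix} \]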

Matrix of means
Define a matrix M with the same dimension n × N as Z,
replicating n times in its columns the mean value of each
variable:
\[ \mathbf{M} = \begin{pmatrix}
m_1^* & \dots & m_i^* & \dots & m_N^* \\
\vdots & & \vdots & & \vdots \\
m_1^* & \dots & m_i^* & \dots & m_N^* \\
\vdots & & \vdots & & \vdots \\
m_1^* & \dots & m_i^* & \dots & m_N^*
\end{pmatrix} \]

Centered variables
A matrix Zc of centered variables is obtained by subtracting M
from the raw data matrix:
Zc = Z − M

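Variance-covariance matrix
The matrix V of experimental variances and covariances is:
\[ \mathbf{V} = \frac{1}{n} \mathbf{Z}_c^\top \mathbf{Z}_c = \begin{pmatrix}
\operatorname{var}(z_1) & \dots & \operatorname{cov}(z_1, z_j) & \dots & \operatorname{cov}(z_1, z_N) \\
\vdots & \ddots & & & \vdots \\
\operatorname{cov}(z_i, z_1) & \dots & \operatorname{var}(z_i) & \dots & \operatorname{cov}(z_i, z_N) \\
\vdots & & & \ddots & \vdots \\
\operatorname{cov}(z_N, z_1) & \dots & \operatorname{cov}(z_N, z_j) & \dots & \operatorname{var}(z_N)
\end{pmatrix}
= \begin{pmatrix}
s_{11} & \dots & s_{1j} & \dots & s_{1N} \\
\vdots & \ddots & & & \vdots \\
s_{i1} & \dots & s_{ii} & \dots & s_{iN} \\
\vdots & & & \ddots & \vdots \\
s_{N1} & \dots & s_{Nj} & \dots & s_{NN}
\end{pmatrix} \]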

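Multiple linear regression
For a regression of $z_0$ on the N variables from n samples
we have the matrix equation
\[ \mathbf{z}_0^* = \mathbf{m}_0 + (\mathbf{Z} - \mathbf{M})\, \mathbf{a} \]
The squared distance between $\mathbf{z}_0$ and the hyperplane is:
\[ \mathrm{dist}^2(\mathbf{a}) = \frac{1}{n} (\mathbf{z}_0 - \mathbf{z}_0^*)^\top (\mathbf{z}_0 - \mathbf{z}_0^*) = \operatorname{var}(z_0) + \mathbf{a}^\top \mathbf{V} \mathbf{a} - 2\, \mathbf{a}^\top \mathbf{v}_0 , \]
where $\mathbf{v}_0$ is the vector of covariances between $z_0$ and $z_i$, i = 1, . . . , N.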

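Minimizing the squared distance
The minimum is found for:
\[ \frac{\partial\, \mathrm{dist}^2(\mathbf{a})}{\partial \mathbf{a}} = 0 \iff 2\, \mathbf{V} \mathbf{a} - 2\, \mathbf{v}_0 = 0 \iff \mathbf{V} \mathbf{a} = \mathbf{v}_0 \]
This system of linear equations:
\[ \begin{pmatrix}
\operatorname{var}(z_1) & \dots & \operatorname{cov}(z_1, z_N) \\
\vdots & \ddots & \vdots \\
\operatorname{cov}(z_N, z_1) & \dots & \operatorname{var}(z_N)
\end{pmatrix}
\begin{pmatrix} a_1 \\ \vdots \\ a_N \end{pmatrix}
= \begin{pmatrix} \operatorname{cov}(z_0, z_1) \\ \vdots \\ \operatorname{cov}(z_0, z_N) \end{pmatrix} \]
has exactly one solution if the determinant of V is different from zero.
The squared distance at the minimum is:
\[ \mathrm{dist}^2_{\min}(\mathbf{a}) = \operatorname{var}(z_0) - \mathbf{a}^\top \mathbf{v}_0 \]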
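The chain from centered variables to the system Va = v0 can be sketched in a few lines of NumPy. The data below are synthetic and the coefficients (1, -2, 0.5) are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, N = 500, 3
mix = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 1.0]])
Z = rng.normal(size=(n, N)) @ mix                        # correlated variables
z0 = 2.0 + Z @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

M = np.tile(Z.mean(axis=0), (n, 1))    # matrix of means (n x N)
Zc = Z - M                             # centered variables
V = Zc.T @ Zc / n                      # variance-covariance matrix
v0 = Zc.T @ (z0 - z0.mean()) / n       # covariances between z0 and each z_i

a = np.linalg.solve(V, v0)             # solves V a = v0
z0_star = z0.mean() + Zc @ a           # regression estimate
dist2_min = np.mean((z0 - z0_star) ** 2)
print(a, dist2_min, z0.var() - a @ v0)
```

`np.linalg.solve` fails with a `LinAlgError` when V is singular, which corresponds to the zero-determinant case excluded above.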

Simple kriging

Spatial data
Data points $x_\alpha$ and the estimation point $x_0$ in a spatial domain D
[Figure: irregularly spaced data points ● and the estimation point ❍ inside the domain D]

Translation invariance
The expectation and the covariance are both assumed
translation invariant over the domain,
i.e. for any vector h between points x and x+h:
\[ E\big[Z(x+h)\big] = E\big[Z(x)\big] = m \]
\[ \operatorname{cov}\big[Z(x+h), Z(x)\big] = C(h) \]
The expectation E[Z(x)] has the same value m
at any point x of the domain D.
The covariance between any pair of locations
depends only on the vector h.

Known mean
We assume the mean m is known
and build the estimator:
\[ Z^*(x_0) = m + \sum_{\alpha=1}^{n} w_\alpha \big( Z(x_\alpha) - m \big) \]
i.e.
\[ Z^*(x_0) - m = \sum_{\alpha=1}^{n} w_\alpha \big( Z(x_\alpha) - m \big) \]
which is implicitly without bias:
\[ E\big[ Z^*(x_0) - m \big] = \sum_{\alpha=1}^{n} w_\alpha \, E\big[ Z(x_\alpha) - m \big] = 0 \]

Simple kriging equations
The kriging equations with known mean are simple:
\[ \sum_{\beta=1}^{n} w_\beta^{SK} \, C(x_\alpha - x_\beta) = C(x_\alpha - x_0) \quad \text{for } \alpha = 1, \dots, n \]
i.e.
the linear combination of weights with
the covariances between a data point
and the other data points
=
the covariance between that data point
and the point to estimate.
The simple kriging variance (the variance of the estimation error) is:
\[ \sigma_{SK}^2 = \sigma^2 - \sum_{\alpha=1}^{n} w_\alpha^{SK} \, C(x_\alpha - x_0) \]
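A minimal sketch of simple kriging follows, assuming an exponential covariance with unit sill and range parameter 10 and four made-up data points; none of these values come from the slides.

```python
import numpy as np

def cov_exp(h, sill=1.0, a=10.0):
    """Exponential covariance (assumed model for this illustration)."""
    return sill * np.exp(-np.abs(h) / a)

def simple_kriging(x, z, x0, m, cov):
    """Solve C w = c0 and return the estimate, the kriging variance and the weights."""
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    C = cov(d)                                   # covariances between data points
    c0 = cov(np.linalg.norm(x - x0, axis=-1))    # covariances data / target
    w = np.linalg.solve(C, c0)
    est = m + w @ (z - m)
    var = cov(0.0) - w @ c0                      # sigma^2 - sum_a w_a C(x_a - x0)
    return est, var, w

x = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [7.0, 7.0]])
z = np.array([1.2, 0.7, 1.9, 1.1])
est, var, w = simple_kriging(x, z, np.array([4.0, 4.0]), m=1.0, cov=cov_exp)
est_at_data, var_at_data, _ = simple_kriging(x, z, x[1], m=1.0, cov=cov_exp)
print(est, var, est_at_data, var_at_data)
```

When the target coincides with a data point, the weights reduce to a unit vector, so the estimate returns the datum and the kriging variance vanishes.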

Simple kriging: a multiple linear regression
Simple kriging is a multiple linear regression between spatial
random variables.
Like: Va = v0 , we have: Cw = c0
Writing out the equation system:
\[ \begin{pmatrix}
\operatorname{var}(Z(x_1)) & \dots & \operatorname{cov}(Z(x_1), Z(x_N)) \\
\vdots & \ddots & \vdots \\
\operatorname{cov}(Z(x_N), Z(x_1)) & \dots & \operatorname{var}(Z(x_N))
\end{pmatrix}
\begin{pmatrix} w_1 \\ \vdots \\ w_N \end{pmatrix}
= \begin{pmatrix} \operatorname{cov}(Z(x_0), Z(x_1)) \\ \vdots \\ \operatorname{cov}(Z(x_0), Z(x_N)) \end{pmatrix} \]

Regionalized variables
and random function

The concept of a Random Function
Consider a domain D with points x.
Let Z(x) be a random variable at a location x ∈ D.
The family of random variables
\[ \{\, Z(x);\ x \in D \,\} \]
is called a Random Function.

Regionalized Variable
The regionalized variable z(x) is the spatial variable of interest (“reality”).
Data does not generally allow a deterministic reconstruction of the regionalized variable.
The regionalized variable z(x) is considered as a realization (draw) of a random function Z(x).
For a given data set, different realizations containing the data are equally plausible to represent the regionalized variable.

Epistemological Problem
We possess data about only one realization:
how can we specify the random function?
Objective quantities that describe the regionalized variable and
conventional parameters that are constitutive of the model have
to be distinguished.
The quantities are estimated from data,
but the parameters are chosen.
→ G. Matheron, “Estimating and Choosing”

Variogram
definition

The Variogram
The vector $x = (x_1, x_2)^\top$: coordinates of a point in 2D.
Let h be the vector separating two points:
[Figure: two points x_α and x_β separated by the vector h]
We compare sample values z at a pair of points with:
\[ \frac{\big( z(x+h) - z(x) \big)^2}{2} \]

The Variogram Cloud
Variogram values are plotted against distance in space:
[Figure: variogram cloud; ordinate |z(t+h) − z(t)|² / 2, abscissa |h|]

The Experimental Variogram
Averages within distance (and angle) classes k are computed:
[Figure: variogram cloud with class averages γ*(h_k) plotted against |h| for the classes h1, . . . , h9]
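The passage from the variogram cloud to the experimental variogram can be sketched as follows. This is an omnidirectional version (distance classes only, no angle classes); the function name, the class limits and the random test data are assumptions made for illustration.

```python
import numpy as np

def experimental_variogram(x, z, lag_edges):
    """Average the variogram cloud within distance classes [edge_k, edge_k+1)."""
    i, j = np.triu_indices(len(z), k=1)           # every pair of points once
    h = np.linalg.norm(x[i] - x[j], axis=-1)      # cloud abscissa: |h|
    g = 0.5 * (z[i] - z[j]) ** 2                  # cloud ordinate: squared difference / 2
    k = np.digitize(h, lag_edges) - 1             # class index of each pair
    nk = len(lag_edges) - 1
    gamma = np.full(nk, np.nan)                   # NaN marks an empty class
    counts = np.zeros(nk, dtype=int)
    for c in range(nk):
        sel = k == c
        counts[c] = sel.sum()
        if counts[c] > 0:
            gamma[c] = g[sel].mean()
    return gamma, counts

rng = np.random.default_rng(2)
x = rng.uniform(0.0, 100.0, size=(150, 2))   # hypothetical sample locations
z = rng.normal(size=150)                     # hypothetical (uncorrelated) values
gamma, counts = experimental_variogram(x, z, np.linspace(0.0, 50.0, 11))
print(np.round(gamma, 2), counts)
```

Pairs farther apart than the last class limit are simply ignored; in practice the number of pairs per class is worth inspecting before a model is fitted.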

The Theoretical Variogram
A theoretical model is fitted:
[Figure: theoretical variogram γ(h) fitted to the experimental variogram, plotted against |h|]

Intrinsic Hypothesis
The first two moments of the increments are assumed stationary
(translation-invariant):
the expectation does not depend on x
\[ E\big[ Z(x+h) - Z(x) \big] = 0 \]
the variance depends only on h
\[ \operatorname{var}\big[ Z(x+h) - Z(x) \big] = 2\, \gamma(h) \]
This type of stationarity is called intrinsic.
→ The stationarity of the increments does not imply the
stationarity of Z.

Definition of the Variogram
By the intrinsic hypothesis:
\[ \gamma(h) = \frac{1}{2} E\Big[ \big( Z(x+h) - Z(x) \big)^2 \Big] \]
Properties
- zero at the origin: γ(0) = 0
- positive values: γ(h) ≥ 0
- even function: γ(h) = γ(−h)
Regionalized variable ←→ Behavior at the origin
[sketch] ←→ continuous and differentiable
[sketch] ←→ not differentiable
[sketch] ←→ discontinuous

Variogram and Covariance Function
The covariance function is defined as:
\[ C(h) = E\Big[ \big( Z(x) - m \big) \cdot \big( Z(x+h) - m \big) \Big] \]
where stationarity of the first two moments of Z is assumed.
A variogram can be constructed from any covariance function:
\[ \gamma(h) = C(0) - C(h) \]
Conversely, however, only if the variogram is bounded does a corresponding covariance function C(h) exist.
The variogram characterizes a larger class of random functions.
This is why it is preferred in geostatistics.

Variogram
examples

Power variogram
\[ \gamma(h) = |h|^p , \qquad 0 < p \le 2 \]
[Figure: power model, variogram against distance (from −4 to 4) for p = 0.5, p = 1 and p = 1.5]

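Spherical covariance function
\[ C(h) = \begin{cases} 1 - \left( \dfrac{3}{2} \dfrac{|h|}{a} - \dfrac{1}{2} \dfrac{|h|^3}{a^3} \right) & \text{for } |h| \le a \\ 0 & \text{for } |h| > a \end{cases} \]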

Exponential covariance function
\[ C(h) = \exp\left( - \frac{|h|}{a} \right) \]

Gaussian covariance function
\[ C(h) = \exp\left( - \frac{|h|^2}{a^2} \right) \]

Cardinal sine covariance function
\[ C(h) = \frac{\sin\left( \dfrac{|h|}{a} \right)}{\dfrac{|h|}{a}} \]
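The four covariance models can be written as small functions, here normalized so that C(0) = 1. In this sketch the spherical model is taken as one minus the cubic polynomial in |h|/a inside the range and zero beyond it; the helper `variogram` applies γ(h) = C(0) − C(h). Function names are illustrative.

```python
import numpy as np

def spherical(h, a):
    r = np.minimum(np.abs(h) / a, 1.0)        # clipping makes C(h) = 0 beyond the range a
    return 1.0 - 1.5 * r + 0.5 * r ** 3

def exponential(h, a):
    return np.exp(-np.abs(h) / a)

def gaussian(h, a):
    return np.exp(-(np.abs(h) / a) ** 2)

def cardinal_sine(h, a):
    # np.sinc(t) = sin(pi t) / (pi t), hence the division by pi; sinc(0) = 1
    return np.sinc(np.abs(h) / (a * np.pi))

def variogram(cov, h, a):
    """Variogram associated with a covariance function: C(0) - C(h)."""
    return cov(0.0, a) - cov(h, a)

h = np.linspace(0.0, 5.0, 6)
print(variogram(spherical, h, a=2.0))
```

Only the spherical model reaches its sill exactly at |h| = a; the exponential and Gaussian models approach it asymptotically, and the cardinal sine oscillates around it.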

Geometric anisotropy
of the variogram

In practice the range of the variogram may change depending
on the direction:
[Figure: coordinate axes h1, h2 and rotated axes h'1, h'2]
Correction:
rotation h' = Qh of angle θ, where
\[ Q = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \]
linear transformation of the coordinates $h' = (h'_1, h'_2)$
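One common way to apply this correction is sketched below: rotate h into the anisotropy axes, then rescale one coordinate so that a single isotropic model with range a1 can be used. The rescaling by the ratio a1/a2 is an assumption of this sketch (the slide only mentions a linear transformation), and the function name is hypothetical.

```python
import numpy as np

def anisotropic_distance(h, theta, a1, a2):
    """Equivalent isotropic distance for ranges a1, a2 along the rotated axes."""
    Q = np.array([[np.cos(theta), np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])
    hp = Q @ np.asarray(h, dtype=float)   # h' = Q h
    hp[1] *= a1 / a2                      # stretch the second axis so both ranges equal a1
    return np.linalg.norm(hp)

# A separation of a2 along the second axis counts as a full range a1
print(anisotropic_distance([0.0, 1.0], theta=0.0, a1=2.0, a2=1.0))
```

The returned distance can then be passed to any isotropic variogram or covariance model with range a1.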

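Rotation in 3D
In 3D the rotation is obtained by a composition of elementary
rotations:
\[ Q = \begin{pmatrix} \cos\theta_3 & \sin\theta_3 & 0 \\ -\sin\theta_3 & \cos\theta_3 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_2 & \sin\theta_2 \\ 0 & -\sin\theta_2 & \cos\theta_2 \end{pmatrix}
\begin{pmatrix} \cos\theta_1 & \sin\theta_1 & 0 \\ -\sin\theta_1 & \cos\theta_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
where $\theta_1, \theta_2, \theta_3$ are Euler’s angles.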

2D example: Ebro river vertical section
[Figure: vertical section of the Ebro river; depth (m) from 0 to −6 against distance along the river (km)]
185 Hydrolab Surveyor III conductivity measurements

2D conductivity variogram model
[Figure: variogram of CONDUCTIVITY against distance (D1: km; D2: m), showing the experimental variograms D1, D2 and the models M1, M2]
Experimental variogram for D1=horizontal, D2=vertical.
Anisotropic cubic variogram model in both directions (M1, M2).
Abscissa scale: kilometers for D1 and meters for D2.

Behavior at the origin
of the variogram

Ebro river: water samples
[Figure: two vertical sections, depth (m) against distance from mouth (km)]
47 water samples (top), 185 conductivity values (bottom)

Nitrate variogram: which behavior at origin?
[Figure: two panels, CUBIC and EXPONENTIAL, each showing the NITRATE experimental variogram in directions D1, D2 with the fitted models M1, M2; lag in m (red) and km (black)]
Nitrate experimental variogram with two alternate models.

Cubic variogram: conditional simulations
[Figure: three conditional simulations shown as vertical sections, depth (m) against distance (km), with a color scale from <0 to >=124.8]


Exponential model: conditional simulations
[Figure: three conditional simulations shown as vertical sections, depth (m) against distance (km), with a color scale from <0 to >=124.8]


Kriging of the mean
of a random function

Spatially Correlated Data
Sample locations $x_\alpha$ in a geographical domain:
[Figure: irregularly spaced sample points]
With spatial correlation we need to consider that:
- sample points have a different number of immediate neighbors,
- distances to neighboring points play a role.
How should samples be weighted in an optimal way?

Estimation of the Mean Value
Using the formula of the arithmetic mean:
\[ M^* = \frac{1}{n} \sum_{\alpha=1}^{n} Z(x_\alpha) \]
all samples get the same weight: 1/n.
We rather need an estimator:
\[ M^* = \sum_{\alpha=1}^{n} w_\alpha \, Z(x_\alpha) \]
with weights $w_\alpha$ reflecting the spatial correlation.

Stationary random function
We assume translation-invariance of mean and covariance:
\[ \forall\, x \in D : \ E\big[ Z(x) \big] = m ; \qquad \forall\, x_\alpha, x_\beta \in D : \ C(x_\alpha, x_\beta) = C(x_\alpha - x_\beta). \]
The estimation error in our statistical model:
\[ \underbrace{M^*}_{\text{estimated value}} - \underbrace{m}_{\text{true value}} \]
should be zero, on average:
\[ E\big[ M^* - m \big] = 0 \]

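No bias
No bias is obtained using weights of unit sum:
\[ \sum_{\alpha=1}^{n} w_\alpha = 1 \]
Consider:
\[ E\big[ M^* - m \big] = E\Big[ \sum_{\alpha=1}^{n} w_\alpha Z(x_\alpha) - m \Big]
= \sum_{\alpha=1}^{n} w_\alpha \underbrace{E\big[ Z(x_\alpha) \big]}_{m} - m
= m \underbrace{\sum_{\alpha=1}^{n} w_\alpha}_{1} - m = 0 \]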

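Variance of the estimation error
The variance $\sigma_E^2$ of the estimation error is:
\[ \operatorname{var}(M^* - m) = E\big[ (M^* - m)^2 \big] - \underbrace{\big( E[ M^* - m ] \big)^2}_{0} \]
\[ = E\big[ M^{*2} - 2\, m M^* + m^2 \big] \]
\[ = \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} w_\alpha w_\beta \, E\big[ Z(x_\alpha)\, Z(x_\beta) \big] - 2\, m \sum_{\alpha=1}^{n} w_\alpha \underbrace{E\big[ Z(x_\alpha) \big]}_{m} + m^2 \]
\[ \Rightarrow \ \sigma_E^2 = \sum_{\alpha=1}^{n} \sum_{\beta=1}^{n} w_\alpha w_\beta \, C(x_\alpha - x_\beta) \]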

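Minimal estimation variance
We want weights $w_\alpha$ that produce a minimal estimation variance:
\[ \text{minimum of } \operatorname{var}(M^* - m) \ \text{ subject to } \ \sum_{\alpha=1}^{n} w_\alpha = 1 \]
The objective function ϕ has n+1 parameters:
\[ \varphi(w_1, \dots, w_n, \mu) = \operatorname{var}(M^* - m) - 2\, \mu \Big( \sum_{\alpha=1}^{n} w_\alpha - 1 \Big) \]
with µ a Lagrange multiplier. Setting partial derivatives to zero:
\[ \forall \alpha : \ \frac{\partial \varphi(w_1, \dots, w_n, \mu)}{\partial w_\alpha} = 0, \qquad \frac{\partial \varphi(w_1, \dots, w_n, \mu)}{\partial \mu} = 0 \]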

Kriging equations
The method of Lagrange yields the equations for
the optimal weights $w_\alpha^{KM}$ of the kriging of the mean:
\[ \begin{cases}
\displaystyle \sum_{\beta=1}^{n} w_\beta^{KM} \, C(x_\alpha - x_\beta) - \mu_{KM} = 0 & \text{for } \alpha = 1, \dots, n \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta^{KM} = 1
\end{cases} \]
The variance at the minimum:
\[ \sigma_{KM}^2 = \mu_{KM} \]
is equal to the Lagrange multiplier.

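Case of no autocorrelation
When the covariance model is a pure nugget-effect:
\[ C(x_\alpha - x_\beta) = \begin{cases} \sigma^2 & \text{if } x_\alpha = x_\beta \\ 0 & \text{if } x_\alpha \ne x_\beta \end{cases} \]
the kriging of the mean system simplifies to:
\[ \begin{cases}
w_\alpha^{KM} \, \sigma^2 = \mu_{KM} & \text{for } \alpha = 1, \dots, n \\[1ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta^{KM} = 1
\end{cases} \]
The solution weights are all equal: $w_\alpha^{KM} = 1/n$
\[ \Rightarrow \ M^* = \frac{1}{n} \sum_{\alpha=1}^{n} Z(x_\alpha) \ \text{ (the arithmetic mean!)}, \qquad \mu_{KM} = \sigma_{KM}^2 = \frac{1}{n} \sigma^2 \]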
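The kriging-of-the-mean system, with its unit-sum constraint and Lagrange multiplier, can be solved as one bordered linear system. The sketch below checks the pure nugget-effect case; the sample coordinates, values and σ² = 4 are made up for illustration.

```python
import numpy as np

def kriging_of_mean(x, z, cov):
    """Solve sum_b w_b C(x_a - x_b) - mu = 0 (all a) together with sum_b w_b = 1."""
    n = len(z)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = cov(d)
    A[:n, n] = -1.0          # coefficient of mu in the first n equations
    A[n, :n] = 1.0           # unit-sum constraint on the weights
    b = np.zeros(n + 1)
    b[n] = 1.0
    sol = np.linalg.solve(A, b)
    w, mu = sol[:n], sol[n]
    return w @ z, mu, w      # estimate M*, variance sigma_KM^2 = mu, weights

sigma2 = 4.0
nugget = lambda d: np.where(d == 0.0, sigma2, 0.0)   # pure nugget-effect covariance

x = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 2.0], [3.0, 7.0], [8.0, 8.0]])
z = np.array([2.0, 3.5, 1.0, 4.0, 2.5])
est, mu, w = kriging_of_mean(x, z, nugget)
print(w, est, mu)            # weights 1/n, arithmetic mean, sigma^2 / n
```

With a spatially correlated covariance passed as `cov`, the same function returns unequal weights, typically lower for clustered samples.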

Ordinary Kriging
at a point in the domain

Estimation at a Point
Sample locations $x_\alpha$ (dots) in a domain D:
[Figure: irregularly spaced sample points ● and the estimation point x0]
We wish to estimate a value $Z^*$ at a point $x_0$.

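Ordinary kriging
The estimate $Z^*$ is a weighted average of data values $Z(x_\alpha)$:
\[ Z^*(x_0) = \sum_{\alpha=1}^{n} w_\alpha \, Z(x_\alpha) \quad \text{with} \quad \sum_{\alpha=1}^{n} w_\alpha = 1 \]
The weights $w_\alpha^{OK}$ of the Best Linear Unbiased Estimator (BLUE)
are the solution of the system:
\[ \begin{cases}
\displaystyle \sum_{\beta=1}^{n} w_\beta^{OK} \, \gamma(x_\alpha - x_\beta) + \mu_{OK} = \gamma(x_\alpha - x_0) & \forall \alpha \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta^{OK} = 1
\end{cases} \]
Minimal variance:
\[ \sigma_{OK}^2 = \mu_{OK} + \sum_{\alpha=1}^{n} w_\alpha^{OK} \, \gamma(x_\alpha - x_0) \]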
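The ordinary kriging system in terms of the variogram can be assembled and solved directly. A minimal sketch, using a pure nugget-effect variogram with unit sill and four samples placed symmetrically around the target (configuration and data values are assumptions for the example):

```python
import numpy as np

def ordinary_kriging(x, z, x0, gamma):
    """Solve the variogram-based OK system; return estimate, variance, weights."""
    n = len(z)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))      # last row and column carry the unit-sum constraint
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(x - x0, axis=-1))
    sol = np.linalg.solve(A, b)
    w, mu = sol[:n], sol[n]
    return w @ z, mu + w @ b[:n], w  # variance: mu + sum_a w_a gamma(x_a - x0)

nugget = lambda d: np.where(d > 0, 1.0, 0.0)   # pure nugget-effect variogram, sill 1

x = np.array([[0.0, 1.0], [-1.0, 0.0], [1.0, 0.0], [0.0, -1.0]])
z = np.array([3.0, 1.0, 2.0, 6.0])
est, var, w = ordinary_kriging(x, z, np.zeros(2), nugget)
print(w, var)    # four weights of 25% and a kriging variance of 1.25
```

Solving by hand gives the same result: each equation reads (1 − w_α) + µ = 1, so w_α = µ = 1/4 and the variance is 1/4 + 1.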

Cross-validation
leaving one out and reestimating it

Cross-validation
Comment: the sound way to cross-validate is to leave out half
of the data locations and to re-estimate them from the other
half: this requires many data! For that reason it is often done in
the following way (implemented in software packages). . .
A data value $Z(x_\alpha)$ is left out and a value $Z^*(x_{[\alpha]})$ is estimated at
location $x_\alpha$ by ordinary kriging.
The notation [α] means that the sample at $x_\alpha$ has not been used
for estimating $Z^*(x_{[\alpha]})$.
The difference between the data value and the estimated value:
\[ Z(x_\alpha) - Z^*(x_{[\alpha]}) \]
gives an indication of how well the data value fits into the
neighborhood of the surrounding data values.

Average cross-validation error
If the average of the cross-validation errors is not far from zero:
\[ \frac{1}{n} \sum_{\alpha=1}^{n} \Big( Z(x_\alpha) - Z^*(x_{[\alpha]}) \Big) \cong 0 \]
then there is no systematic bias.
A negative (positive) average error represents systematic overestimation (underestimation).

Standardized cross-validation error
The kriging standard deviation $\sigma_K$ represents the error predicted
by the model.
Dividing the cross-validation error by $\sigma_K$ makes it possible to compare the
magnitudes of both errors:
\[ \frac{Z(x_\alpha) - Z^*(x_{[\alpha]})}{\sigma_{K\alpha}} \]

Average squared standardized errors
If the average of the squared standardized cross-validation
errors is not far from one:
\[ \frac{1}{n} \sum_{\alpha=1}^{n} \frac{\Big( Z(x_\alpha) - Z^*(x_{[\alpha]}) \Big)^2}{\sigma_{K\alpha}^2} \cong 1 \]
then the actual estimation error is equal on average to the error
predicted by the model.
This quantity gives an idea of the adequacy of the model and of
its parameters.
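The two cross-validation statistics can be computed with a short leave-one-out loop. The sketch below assumes an exponential variogram (unit sill, range parameter 20) and random test data, neither of which comes from the slides; the kriging routine is a plain ordinary kriging in a unique neighborhood.

```python
import numpy as np

def gamma_exp(h, sill=1.0, a=20.0):
    """Exponential variogram (assumed model for this illustration)."""
    return sill * (1.0 - np.exp(-np.abs(h) / a))

def ordinary_kriging(x, z, x0, gamma):
    n = len(z)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    g0 = gamma(np.linalg.norm(x - x0, axis=-1))
    sol = np.linalg.solve(A, np.append(g0, 1.0))
    w, mu = sol[:n], sol[n]
    return w @ z, mu + w @ g0            # estimate and kriging variance

def cross_validate(x, z, gamma):
    """Leave each sample out in turn and re-estimate it from the others."""
    n = len(z)
    err = np.empty(n)
    var = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i         # sample i is not used
        est, var[i] = ordinary_kriging(x[keep], z[keep], x[i], gamma)
        err[i] = z[i] - est
    # average error, average squared standardized error
    return err.mean(), np.mean(err ** 2 / var)

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 100.0, size=(30, 2))
z = rng.normal(size=30)
mean_err, msse = cross_validate(x, z, gamma_exp)
print(mean_err, msse)
```

A mean error near zero and a mean squared standardized error near one are the two diagnostics described above; here the data are uncorrelated noise, so the assumed variogram is not expected to pass the second test.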

Mapping with kriging
on a regular grid
with irregularly spaced data

Kriging for interpolation
Kriging is an estimation method.
It is not the quickest method to make an interpolation on a
regular grid for generating a map.
Its advantages are:
- Kriging integrates the knowledge gained from analysing the spatial structure: the variogram.
- Kriging interpolates exactly: when a sample value is available at the location of interest, the kriging solution is equal to that value.
- Kriging provides an indication of the estimation error: the kriging variance.

Generating a map
A regular grid is defined by the computer and
at each node of this grid a value is kriged.
[Figure: regular grid of nodes ❍ overlaid on the irregularly spaced data points ●, with one node marked x0]
Afterwards a graphical representation of this grid is produced,
as a raster of colour squares, as an isoline map, as a block
diagram. . .

Moving Neighborhood
If all data are used: this is called a unique neighborhood.
Using a subset of close data points: a moving neighborhood.
[Figure: regular grid of nodes ❍ overlaid on the data points ●, with the neighborhood of the node x0]
To choose the size of the moving neighborhood, the range of the
variogram can give an indication.

Kriging weights
The shape of the kriging weights

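Kriging weights
Nugget-effect model: $\sigma_{OK}^2 = 1.25$
[Figure: four samples ● around the estimation point ❍ (top, left, right, bottom), with the length L marked; each sample receives a weight of 25%]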

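Isotropic variogram
Spherical model with range a/L = 2: $\sigma_{OK}^2 = .84$
[Figure: four samples ● around the estimation point ❍; weights 40.6% (top), 9.4% (left), 9.4% (right), 40.6% (bottom)]
Gaussian model with range a/L = 1.5: $\sigma_{OK}^2 = .30$
[Figure: same configuration; weights 49.8% (top), 0.2% (left), 0.2% (right), 49.8% (bottom)]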

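Spherical with isotropic range
[Figure: four samples ● around the estimation point ❍; each sample receives a weight of 25%]
Spherical with horizontal a/L = 1.5 and vertical a/L = .75
[Figure: same configuration; weights 17.6% (top), 32.4% (left), 32.4% (right), 17.6% (bottom)]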

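Relative position of samples
[Figure: two configurations of three samples ● around the estimation point ❍. Left: $\sigma_{OK}^2 = .45$, weights 33.3%, 33.3%, 33.3%. Right: $\sigma_{OK}^2 = .48$, weights 37.1%, 37.1%, 25.9%]
The left configuration gives a more reliable estimate.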

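The screen effect
Spherical model with range a/L = 2
[Figure: samples A and B on either side of the estimation point ❍; weights 65.6% (A) and 34.4% (B); $\sigma_{OK}^2 = 1.14$]
[Figure: sample C added between ❍ and B; weights 49.1% (A), 48.2% (C), 2.7% (B); $\sigma_{OK}^2 = 0.87$]
Adding the sample C screens off the sample B.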

Nested variogram
and corresponding linear model

of the random function

Nested Variogram Model
A nested variogram γ(h) is composed of
a sum of elementary variograms $\gamma_u(h)$
with u = 0, . . . , S:
\[ \gamma(h) = \gamma_0(h) + \dots + \gamma_S(h) = \sum_{u=0}^{S} \gamma_u(h) \]
Each variogram $\gamma_u(h)$ is built up with a normed variogram $g_u(h)$
multiplied by a coefficient $b_u$ (sill, slope):
\[ \gamma(h) = \sum_{u=0}^{S} b_u \, g_u(h) \]

Example: Arsenic in soil (Loire, France)
[Figure: map of sample locations along the river Loire; coordinates from 310 to 340 (easting) and 270 to 285 (northing)]
35×25 km2 region. Dots are proportional to sample value.

Example: Nested Variogram Model
A nugget-effect (nug) and two spherical (sph) structures:
\[ \gamma(h) = b_0 \, \mathrm{nug}(h) + b_1 \, \mathrm{sph}(h, a_1) + b_2 \, \mathrm{sph}(h, a_2) \]
[Figure: nested variogram γ(h) plotted against h (0 to 11), with the nugget, short range and long range structures indicated]
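A nested model of this form is just a weighted sum of normed variograms. In the sketch below the coefficients b0, b1, b2 and the ranges a1, a2 are hypothetical values chosen for illustration, not the ones fitted to the arsenic data.

```python
import numpy as np

def nug(h):
    """Normed nugget-effect variogram: 0 at the origin, 1 elsewhere."""
    return np.where(np.abs(h) > 0, 1.0, 0.0)

def sph(h, a):
    """Normed spherical variogram with range a."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3

def nested(h, b0, b1, a1, b2, a2):
    return b0 * nug(h) + b1 * sph(h, a1) + b2 * sph(h, a2)

# Hypothetical coefficients and ranges
params = dict(b0=0.2, b1=0.3, a1=2.0, b2=0.5, a2=8.0)
h = np.array([0.0, 1.0, 2.0, 8.0, 11.0])
print(nested(h, **params))
```

Beyond the longest range the model reaches the total sill b0 + b1 + b2, the sum of the variances carried by each structure.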

Nested Covariance Function
\[ C(h) = \sum_{u=0}^{S} C_u(h) = \sum_{u=0}^{S} b_u \, \rho_u(h) \]
where $\rho_u(h)$ are correlation functions.
The $\rho_u(h)$ characterize the spatial correlation
at different scales of index u.
The coefficients $b_u$ represent a decomposition of
the total variance $\sigma^2$ into variances at different spatial scales:
\[ C(0) = \sigma^2 = \sum_{u=0}^{S} b_u \]

Regionalization Model
Z(x) is built up with uncorrelated components $Y_u(x)$
of zero mean, with covariance functions $C_u(h)$.
Example:
\[ Z(x) = Y_1(x) + Y_2(x) + m \quad \text{with } Y_1 \perp Y_2 \]
The covariance function of Z(x) is nested:
\[ C(h) = C_1(h) + C_2(h) \]

Linear Model with S + 1 components
\[ Z(x) = \sum_{u=0}^{S} Y_u(x) + m \quad \text{with } Y_u \perp Y_v \text{ for } u \ne v \]
Corresponding nested covariance model:
\[ C(h) = \sum_{u=0}^{S} C_u(h) = \sum_{u=0}^{S} b_u \, \rho_u(h) \]
Can components $Y_u$ be extracted from samples $Z(x_\alpha)$?

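Kriging Spatial Components
A component $Y_1(x)$ at $x_0$ is estimated from n data:
\[ Y_1^*(x_0) = \sum_{\alpha=1}^{n} w_\alpha \, Z(x_\alpha) \]
“No bias” with $\sum_{\alpha=1}^{n} w_\alpha = 0$: this filters the mean m.
Minimizing the “estimation variance”:
\[ \begin{cases}
\displaystyle \sum_{\beta=1}^{n} w_\beta^{1} \, C(x_\alpha - x_\beta) - \mu_1 = C_1(x_\alpha - x_0) & \text{for } \alpha = 1, \dots, n \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta^{1} = 0
\end{cases} \]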

Example: Short-range Component of As
[Figure: map of the kriged short-range component; coordinates from 310 to 340 (easting) and 270 to 285 (northing)]

Example: Long-range Component of As
[Figure: map of the kriged long-range component; coordinates from 310 to 340 (easting) and 270 to 285 (northing)]

Demographic application
fertility data

Demographic application: fertility 1990
[Figure: map of the communes of France for the FERT500 class (mean annual fertility 1990), with a 250 km scale bar, and the number of women per commune on a scale from 100 to 5e+05]
Data provided by INSEE (www.insee.fr)

Variograms: class 100-500 women / commune
[Figure: two panels of FERT500 variograms against distance (km): directional experimental variograms D1 to D4 (long range) and experimental variogram D1 with model M1 (short range)]

Kriging: short range effect only
[Figure: kriged map of the Fert500 short range component in UTM coordinates (km), color scale from <-15 to >=9.96, with the short range variogram of FERT500 (D1, M1) as inset]

Kriging
[Figure: two kriged maps of FERT500 in UTM coordinates (km). Left: long range effect (FERT500long_range), color scale from <35 to >=60. Right: short + long range (Fert500 estimation), color scale from <30 to >=65.2]

Kriging with external drift

Drift
Translation-invariant drift: polynomials, trigonometric functions
External drift: an auxiliary variable known everywhere

External drift method
1. Auxiliary variable s(x) known everywhere in the domain D.
2. The relation to the variable of interest is linear:
\[ E\big[ Z(x) \big] = b_0 + b_1 \, s(x) \]

External Drift method
\[ Z^*(x_0) = \sum_{\alpha=1}^{n} w_\alpha \, Z(x_\alpha) \]
\[ \Rightarrow \ E\big[ Z^*(x_0) \big] = \sum_{\alpha=1}^{n} w_\alpha \big( b_0 + b_1 \, s(x_\alpha) \big) \]

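Constraint: no bias
The constraint
\[ \sum_{\alpha=1}^{n} w_\alpha = 1 \]
has the effect that the coefficients $b_0$ and $b_1$ are filtered out:
\[ E\big[ Z^*(x_0) \big] = E\big[ Z(x_0) \big] \]
\[ \Longrightarrow \ b_0 + b_1 \sum_{\alpha=1}^{n} w_\alpha \, s(x_\alpha) = b_0 + b_1 \, s(x_0) \]
\[ \Longrightarrow \ \sum_{\alpha=1}^{n} w_\alpha \, s(x_\alpha) = s(x_0) \]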

Interpolation of external drift
This second constraint:
\[ \sum_{\alpha=1}^{n} w_\alpha \, s(x_\alpha) = s(x_0) \]
generates weights $w_\alpha$ which interpolate exactly s(x).

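Kriging System with linear and external drift
\[ \begin{cases}
\displaystyle \sum_{\beta=1}^{n} w_\beta \, C(x_\alpha - x_\beta) + \mu_0 + \mu_1 x_\alpha^1 + \mu_2 x_\alpha^2 + \mu_3 \, s(x_\alpha) = C(x_\alpha - x_0), \ \forall \alpha \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta = 1 \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta \, x_\beta^1 = x_0^1 \quad \text{(longitude)} \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta \, x_\beta^2 = x_0^2 \quad \text{(latitude)} \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta \, s(x_\beta) = s(x_0) \quad \text{(external drift)}
\end{cases} \]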
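This system has n + 4 unknowns (n weights and four Lagrange multipliers) and can be assembled as a bordered matrix. The sketch below assumes an exponential covariance with range parameter 30 and synthetic coordinates and auxiliary variable; the function name and all numerical values are illustrative.

```python
import numpy as np

def kriging_external_drift(x, z, s, x0, s0, cov):
    """Kriging with a linear drift in the coordinates plus one external drift s."""
    n = len(z)
    F = np.column_stack([np.ones(n), x[:, 0], x[:, 1], s])   # drift functions at the data
    f0 = np.array([1.0, x0[0], x0[1], s0])                   # drift functions at the target
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    A = np.zeros((n + 4, n + 4))
    A[:n, :n] = cov(d)
    A[:n, n:] = F            # multipliers mu_0 .. mu_3
    A[n:, :n] = F.T          # the four constraints on the weights
    b = np.concatenate([cov(np.linalg.norm(x - x0, axis=-1)), f0])
    w = np.linalg.solve(A, b)[:n]
    return w @ z, w

rng = np.random.default_rng(4)
n = 20
x = rng.uniform(0.0, 100.0, size=(n, 2))
s = 50.0 + 0.01 * (x[:, 0] - 50.0) ** 2 + 10.0 * rng.normal(size=n)  # synthetic auxiliary variable
z = 2.0 + 0.5 * s                                                    # exactly linear in s
x0, s0 = np.array([40.0, 60.0]), 55.0
cov = lambda h: np.exp(-h / 30.0)
est, w = kriging_external_drift(x, z, s, x0, s0, cov)
print(est, 2.0 + 0.5 * s0)
```

Because the weights satisfy all four constraints, any variable that is exactly linear in s(x) is reproduced exactly at the target, which is what the two printed values show.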

Kriging temperature
with elevation as external drift

Temperature Data
temperature conditions the growth of plants
Scotland (without the Shetland and Orkney Islands)
average January temperatures (1961-1980)
146 sites, all below 400 m altitude
Ben Nevis, 1344 m
at 3035 nodes of a regular grid
1272 m
Reference: Int. J. Clim., 14, 77–91, 1994

January temperature vs latitude / longitude
Scatter diagrams of temperature with latitude and longitude:
there is a systematic decrease from west to east, while there is
not much trend in the north-south direction.

E-W and N-S temperature variograms
The model is fitted in the direction without drift.

Kriging mean January temperature

Temperature vs elevation
Coastal influence below 50 m, then linear relation.
Temperatures only available below 400 m.

Kriging temperature with elevation as drift

Temperature estimates vs elevation
The estimated values above 400 m are linearly extrapolated
outside the range of the data!
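Estimated external drift coefficient
\[ b_1^* = \sum_{\alpha=1}^{n} w_\alpha \, Z(x_\alpha) \]
\[ \begin{cases}
\displaystyle \sum_{\beta=1}^{n} w_\beta \, C(x_\alpha - x_\beta) + \mu_0 + \mu_1 x_\alpha^1 + \mu_2 x_\alpha^2 + \mu_3 \, s(x_\alpha) = 0, \ \forall \alpha \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta = 0 \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta \, x_\beta^1 = 0 \quad \text{(longitude)} \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta \, x_\beta^2 = 0 \quad \text{(latitude)} \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta \, s(x_\beta) = 1 \quad \text{(external drift)}
\end{cases} \]

Estimated external drift coefficient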


Conditional simulation


Conditional simulation vs Kriging

Change of support
geostatistical simulation of O3

CASE STUDY: Geostatistical simulation of O3
Simulation of realizations of a lognormal random function
over 800 × 600 km2
on a 1 × 1 km2 support
with a range of 50 km

Simulation of Ozone: 1×1 km2 support
[Figure: simulated map of O3 (ug/m3) on the 1×1 km2 support; axes in km, color scale from <0 to >=96]

O3: 10x10 km2
[Figure: simulated map of O3 (ug/m3) on the 10×10 km2 support; axes in km, color scale from <0 to >=96]

O3: 20x20 km2
[Figure: simulated map of O3 (ug/m3) on the 20×20 km2 support; axes in km, color scale from <0 to >=96]

Simulation of Ozone
                 SUPPORT: 1x1 km2    SUPPORT: 20x20 km2
Nb Samples       480000              1131
Minimum          0.0                 0.2
Maximum          246.3               72.0
Mean             6.7                 6.7
Std. Dev.        10.2                7.9
[Figure: histograms (frequencies) of O3 (ug/m3) for the two supports]
Increasing the support: the means are equal,
but the extremes and the variance are reduced.

Simulation of Ozone
[Figure: variograms of O3 in directions D1 and D2 against distance (km, 0 to 200), for the 1x1 km2 support (left) and the 20x20 km2 support (right)]
Increasing the support:
the range increases.

Simulation of Ozone
[Figure: mean O3 over cutoff (ug/m3) against O3 cutoff (ug/m3, 40 to 120); black = 1x1 km2, blue = 10x10 km2, red = 20x20 km2]

Simulation of Ozone
[Figure: proportion above cutoff (%) against O3 cutoff (ug/m3, 40 to 120); black = 1x1 km2, blue = 10x10 km2, red = 20x20 km2]

Change of support
concept

TOPIC: The Support of a Random Function
Mining (3D): volumes v, V
Soil pollution (2D): surfaces s, S
Industrial hygiene (1D): time intervals ∆t, T

The Effect of Changing the Support
The distribution of samples on small volumes (cm3) is different from
that of model output averages over large blocks (m3):
[Figure: frequency curves of the sample values and of the block values, sharing the same mean]
The mean of both distributions is the same,
the distribution of the block values is narrower.

Neglecting the Support Effect
We are often interested in what is above a threshold:
[Figure: sample and block distributions with a threshold; the area above the threshold is overestimated]
Neglecting the support effect may lead to a systematic
over-estimation. . .

Neglecting the Support Effect
. . . or to systematic under-estimation:
[Figure: sample and block distributions with a threshold; the area above the threshold is underestimated]
⇒ A good estimation method should incorporate
a change of support model.

Kriging of a Block average
(centered at a point in the domain)

Estimation of a block value
Sample locations $x_\alpha$ (dots) in a domain D:
[Figure: irregularly spaced sample points ● and the block V0]
We wish to estimate the spatial average $Z^*$ for a block $V_0$.

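Block Kriging
The block value $Z^*(V_0)$ is estimated as a weighted average of
the data values $Z(x_\alpha)$:
\[ Z^*(V_0) = \sum_{\alpha=1}^{n} w_\alpha \, Z(x_\alpha) \quad \text{with} \quad \sum_{\alpha=1}^{n} w_\alpha = 1 \]
The optimal weights $w_\alpha^{OK}$ are obtained from the system:
\[ \begin{cases}
\displaystyle \sum_{\beta=1}^{n} w_\beta^{OK} \, \gamma(x_\alpha - x_\beta) + \mu_{OK} = \bar\gamma(V_0, x_\alpha) & \forall \alpha \\[2ex]
\displaystyle \sum_{\beta=1}^{n} w_\beta^{OK} = 1
\end{cases} \]
Kriging variance:
\[ \sigma_{OK}^2 = \mu_{OK} - \bar\gamma(V_0, V_0) + \sum_{\alpha=1}^{n} w_\alpha^{OK} \, \bar\gamma(V_0, x_\alpha) \]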

YXW
V
Block kriging with non-point data
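In applications the data can be averages over blocks $V_\alpha$.
We then use average variograms between these blocks:
\[
\gamma(V_\alpha, V_\beta) = \frac{1}{|V_\alpha| \, |V_\beta|}
\int_{x \in V_\alpha} \int_{y \in V_\beta} \gamma(x - y) \, dx \, dy
\]
This requires knowledge of the point variogram.
[Figure: a data block $v_\alpha$ and the target block $V_0$.]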
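The double integral defining the average variogram between two blocks can be approximated by a mean over pairs of discretization points. The sketch below assumes a hypothetical exponential point variogram and two square blocks; the function names and all parameters are illustrative.

```python
import numpy as np

def exponential_variogram(h, sill=1.0, scale=5.0):
    """Exponential point variogram (an assumed model for illustration)."""
    return sill * (1.0 - np.exp(-np.asarray(h, dtype=float) / scale))

def discretize(center, size, n=8):
    """Regular n x n grid of points inside a square block."""
    t = ((np.arange(n) + 0.5) / n - 0.5) * size
    gx, gy = np.meshgrid(t, t)
    return np.column_stack([gx.ravel(), gy.ravel()]) + np.asarray(center, dtype=float)

def average_variogram(block_a, block_b, gamma=exponential_variogram):
    """Approximate the double integral by a mean over all pairs of grid points."""
    d = np.linalg.norm(block_a[:, None, :] - block_b[None, :, :], axis=2)
    return gamma(d).mean()

v_alpha = discretize(center=(0.0, 0.0), size=1.0)   # small data block
v_0 = discretize(center=(3.0, 0.0), size=2.0)       # larger target block

print("gamma(v_alpha, V0):", average_variogram(v_alpha, v_0))
print("gamma(V0, V0):", average_variogram(v_0, v_0))
```

A finer grid (larger `n`) gives a closer approximation of the integral at a higher computational cost.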
Change of support
risk of exceeding ozone alert level

Change of support
The variability of spatial or temporal data depends on the
averaging volume/interval (= the support).
With increasing support, the variability decreases
(reduction of variance, extremes...).
Observations are on point support, as compared to the cells
of a numerical model.
End-users are often interested in a support of different
(intermediate) size −→ blocks.
It is thus necessary to describe statistically how variability
changes as a function of support.
If the distribution is unimodal and not too asymmetrical,
an affine correction may suffice. Otherwise, non-linear
geostatistics or geostatistical simulation is needed.
Applications: data aggregation, estimation of small block
statistics, downscaling. . .
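The affine correction mentioned above can be sketched as follows: the point values are shrunk around their mean so that their variance matches a given block variance, while the shape of the distribution is kept. The data, the block variance (here simply set to 60% of the point variance) and the threshold are hypothetical; in practice the block variance would be derived from the variogram.

```python
import numpy as np

def affine_correction(point_values, block_variance):
    """Shrink values around their mean so that the variance equals block_variance."""
    m = point_values.mean()
    ratio = np.sqrt(block_variance / point_values.var())
    return m + ratio * (point_values - m)

rng = np.random.default_rng(1)
points = rng.normal(loc=150.0, scale=25.0, size=5_000)  # hypothetical point-support data
block_var = 0.6 * points.var()                           # assumed block variance

blocks = affine_correction(points, block_var)
threshold = 180.0
print("proportion above threshold, points:", (points > threshold).mean())
print("proportion above threshold, blocks:", (blocks > threshold).mean())
```

The mean is unchanged by construction, and for a threshold above the mean the corrected proportion cannot exceed the one obtained from the point values.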

Ozone in Paris on 17 July 1999 at 15h UTC
Airparif stations and Chimere grid
[Map: Paris region, longitude 1.4° to 3.2°, latitude 48.2° to 49.4°, 50 km scale bar. Station labels: RUR_NO, RUR_NE, RUR_O, RUR_E, RUR_SO, RUR_SE, Mantes, Tremblay, Gennevilliers, Aubervilliers, Neuilly, Garches, Vitry, Montgeron, Melun, P18, P7, P6, P13.]
19 Airparif stations; 25 × 25 grid with cells of size 6 × 6 km2
Air quality regulations
Two ozone thresholds referring to a support of 1 hour:
−→ Swiss alert level: 120 µg/m3
−→ European alert level: 180 µg/m3
The time support is always specified, yet regulations give no
indication of the spatial support!
Suppose the air quality experts agree on the following
spatial decision support:
a block of 1 × 1 km2 size
(instead of the CHIMERE 6 × 6 km2 cell).
We need to model the point-block-cell change of support.
Discrete Gaussian point-block model
(due to Georges MATHERON, 1976)
x is a point randomly located in a block v.
\[
E[\, Z(x) \mid Z(v) \,] = Z(v)
\]
is known as Cartier's relation.
For a Gaussian point anamorphosis (station data),
\[
Z(x) = \varphi(Y(x)) = \sum_{k=0}^{\infty} \frac{\varphi_k}{k!} \, H_k(Y(x))
\]
with Hermite polynomials $H_k$ and coefficients $\varphi_k$,
the block anamorphosis $\varphi_v(Y(v))$ comes as:
\[
\varphi_v(Y(v)) = E[\, \varphi(Y(x)) \mid Y(v) \,] = \sum_{k=0}^{\infty} \frac{\varphi_k}{k!} \, r^k \, H_k(Y(v)).
\]
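The two expansions can be evaluated numerically with a truncated Hermite series. The sketch below rests on several assumptions that are not from the lecture: it uses NumPy's probabilists' Hermite polynomials He_k (the lecture's H_k may differ by a sign convention), a hypothetical lognormal point anamorphosis whose coefficients are known in closed form, and arbitrary values for m, s and r.

```python
from math import factorial

import numpy as np
from numpy.polynomial.hermite_e import hermeval

def anamorphosis(y, phi, r=1.0):
    """Evaluate sum_k phi_k / k! * r**k * He_k(y); r = 1 gives the point anamorphosis."""
    coeffs = [p * r**k / factorial(k) for k, p in enumerate(phi)]
    return hermeval(y, coeffs)

# Assumed lognormal point anamorphosis: phi(y) = m * exp(s*y - s**2/2),
# whose coefficients in this convention are phi_k = m * s**k.
m, s, r = 150.0, 0.3, 0.8
phi = [m * s**k for k in range(25)]   # series truncated at order 24

y = np.linspace(-2.0, 2.0, 5)
z_point = anamorphosis(y, phi)        # point anamorphosis
z_block = anamorphosis(y, phi, r=r)   # block anamorphosis with coefficient r

print("point anamorphosis:", z_point)
print("block anamorphosis:", z_block)
```

In this lognormal case the block anamorphosis is again lognormal, m * exp(r*s*y - (r*s)**2/2), with the same mean m and a reduced variance.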

Point-block-cell correlations
The Gaussian block anamorphosis is:
\[
\varphi_v(Y(v)) = \sum_{k=0}^{\infty} \frac{\varphi_k}{k!} \, r^k \, H_k(Y(v)),
\]
with r being the point-block coefficient (0 ≤ r ≤ 1).
r can be computed from the block dispersion variance
(which is calculated from the station data variogram):
\[
\mathrm{var}(Z(v)) = \mathrm{var}(\varphi_v(Y(v))) = \sum_{k=1}^{\infty} \frac{\varphi_k^2}{k!} \, r^{2k}
\]
We get in the same way a point-cell coefficient $r'$.
And finally the block-cell coefficient $r_{vV} = r'/r$.
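Since the variance series is increasing in r on [0, 1], the coefficient can be found by a simple bisection. The sketch below assumes hypothetical coefficients φ_k = m s^k (a lognormal anamorphosis, chosen because the block variance is then known in closed form) and a truncated series. The last lines apply r_vV = r'/r to the values r = .97 and r' = .72 quoted in the lecture's ozone example.

```python
from math import exp, factorial

def block_variance(r, phi):
    """var(Z(v)) = sum over k >= 1 of phi_k**2 / k! * r**(2k)."""
    return sum(p * p / factorial(k) * r ** (2 * k) for k, p in enumerate(phi) if k >= 1)

def solve_r(target_variance, phi, tol=1e-12):
    """Bisection on [0, 1]; block_variance is increasing in r."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if block_variance(mid, phi) < target_variance:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical coefficients phi_k = m * s**k (lognormal anamorphosis, an assumption).
m, s = 150.0, 0.3
phi = [m * s**k for k in range(25)]
target = m**2 * (exp((0.8 * s) ** 2) - 1.0)   # block variance implied by r = 0.8

r = solve_r(target, phi)
print("recovered point-block coefficient:", r)

# Block-cell coefficient from the values quoted in the lecture's ozone example.
r_block_cell = 0.72 / 0.97
print("block-cell coefficient:", r_block_cell)
```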

Uniform conditioning
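It consists in taking the conditional expectation of a
non-linear function of the blocks, given the value of the cell
containing them.
The proportion of blocks v ∈ V0 above the threshold $z_c$,
knowing the cell value $Z(V_0)$, is:
\[
E[\, \mathbf{1}_{Z(v) \ge z_c} \mid Z(V_0) \,] = 1 - G\!\left( \frac{y_c - r_{vV} \, Y(V_0)}{\sqrt{1 - r_{vV}^2}} \right),
\]
where $y_c$ is the Gaussian value corresponding to $z_c$ through the block anamorphosis, $z_c = \varphi_v(y_c)$.
G is the standard Gaussian cumulative distribution function.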
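The uniform conditioning formula is straightforward to evaluate once the Gaussian equivalents are known. In the sketch below the Gaussian threshold y_c and the Gaussian cell values are hypothetical inputs (in practice they would come from inverting the block and cell anamorphoses), and the block-cell coefficient is taken as 0.72/0.97 from the values quoted in the lecture's ozone example.

```python
from math import erf, sqrt

def gaussian_cdf(y):
    """Standard Gaussian cumulative distribution function G."""
    return 0.5 * (1.0 + erf(y / sqrt(2.0)))

def proportion_blocks_above(y_c, y_cell, r_vV):
    """1 - G((y_c - r_vV * Y(V0)) / sqrt(1 - r_vV**2))."""
    return 1.0 - gaussian_cdf((y_c - r_vV * y_cell) / sqrt(1.0 - r_vV**2))

r_vV = 0.72 / 0.97                      # block-cell coefficient
for y_cell in (-1.0, 0.0, 1.0):         # hypothetical Gaussian cell values
    p = proportion_blocks_above(y_c=0.5, y_cell=y_cell, r_vV=r_vV)
    print(f"Y(V0) = {y_cell:+.1f}: proportion of blocks above threshold = {p:.3f}")
```

The higher the cell value, the larger the expected proportion of blocks above the threshold within that cell.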

Variogram of Airparif measurements
[Figure: two variogram panels titled "Variogram : Ozone_17JUL15H", plotted against distance (0 to 60 km).]
Nugget-effect + cubic model.
Sill = variance.
Anamorphosis of Airparif measurements
[Figure: two panels of ozone (100 to 225) against Gaussian values (-2 to 2), labeled r = .97 and r' = .72.]
Anamorphosis of block values (r = .97)
close to the anamorphosis of point values.
Anamorphosis of cell values (r' = .72).
Histograms
[Figure: frequencies (%, 0 to 30) against ozone (120 to 210).]
Histograms of blocks (blue) and cells (red)
on the basis of the change-of-support model.
Proportion of values above threshold
[Figure: proportion above alert level (0 to 100) against ozone (120 to 210).]
Proportions of blocks (blue) and cells (red).
Depending on the threshold, the difference can be large!
Uniform conditioning by CHIMERE
UC 120: CHIMERE + Airparif stations
[Map: contours of exceedance probability (0.1 to 0.9) over the Paris region, longitude 1.4° to 3.2°, latitude 48.2° to 49.4°, 50 km scale bar.]
Exceedance probabilities for 1 × 1 km2 support
with the Swiss threshold of 120 µg/m3
Uniform conditioning by CHIMERE
UC 180: CHIMERE + Airparif stations
[Map: contours of exceedance probability (0.1 to 0.9) over the Paris region, longitude 1.4° to 3.2°, latitude 48.2° to 49.4°, 50 km scale bar.]
Exceedance probabilities for 1 × 1 km2 support
with the European threshold of 180 µg/m3
Precipitation in SE Norway
geostatistical downscaling

Histogram of precipitation: July 2001

Variogram of precipitation
Block and cell anamorphosis
[Figure: two panels: 10×10 km2 blocks (r = .7) and NCEP cells (r = .365).]
Reconstructed histograms
[Figure: two panels: 10×10 km2 blocks and 101×212 km2 NCEP cells.]
Proportion above threshold
[Figure: two panels: 10×10 km2 blocks and NCEP cells.]
A threshold of 100 mm will be used.
Proportion of blocks > 100 mm within NCEP cells
NCEP cells and station values
Color codes (x in mm): 0 < x < 75, 75 < x < 100, 100 < x < 125, x > 125