
Inference for Point Pattern Spatial Statistics
N. Bert Loosmore
nhl@u.washington.edu

QERM 550
University of Washington
May 11 & 13, 2005

Inference for Point Pattern Spatial Statistics – p.1/49


Outline
Use of Point Pattern Statistics in Ecology
The Failure of the Simulation Envelope
Diggle’s (1983, 2003) ‘Goodness of Fit’ Test
Unresolved Implementation Issues
Parameterization Based on the Ecological Research Question
Characterizing Type I, II Error Rate Performance


Point Pattern Statistics in Ecology
Spatial processes and ecological processes
[Figure: mapped point pattern; Easting (m) against Northing (m), both 0 to 200]
What pattern for the green points?
What pattern for the red points?
Do we see (or expect) stationarity?
Point Pattern Spatial Stats: How?
Evaluate observed pattern against ideas of
aggregation (figure: rMatClust() with 105 points, radius = 0.1),
CSR, complete spatial randomness (figure: CSR pattern with 100 points),
inhibition (figure: rSSI() with 100 points, radius = 0.05)

Analyze distances between events:
G (nearest neighbor),
F (grid to nearest point),
K/L (all neighbors)

Typically perform analysis using ‘Simulation Envelope’


Definition of the G and F Statistics
G statistic uses the nearest neighbor distances ($d_i$) for each of $n$ sample points as:

$$\hat{G}(t) = \frac{1}{n} \sum_{i=1}^{n} I(d_i \le t)$$

F statistic uses the distances ($x_j$) from each of $m$ sample points (typically located on a grid) to their nearest event as:

$$\hat{F}(t) = \frac{1}{m} \sum_{j=1}^{m} I(x_j \le t)$$

Under CSR with intensity $\lambda$, both the G and F statistics are approximated as

$$G(t) = F(t) = 1 - \exp(-\lambda \pi t^2)$$
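As a minimal sketch of the G statistic, the Python below computes the empirical nearest-neighbour distribution for one simulated CSR pattern and prints it next to the CSR expectation $1 - \exp(-\lambda \pi t^2)$. It is plain standard-library Python, not the R packages the slides point to; the unit-square window, the absence of edge correction, and the names `nn_distances`, `g_hat` and `g_csr` are assumptions made for illustration.

```python
import bisect
import math
import random


def nn_distances(points):
    """Distance from each event to its nearest neighbouring event."""
    best = [math.inf] * len(points)
    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            d = math.hypot(xi - points[j][0], yi - points[j][1])
            if d < best[i]:
                best[i] = d
            if d < best[j]:
                best[j] = d
    return best


def g_hat(points, t):
    """Empirical G(t): share of events with a nearest neighbour within t."""
    d = sorted(nn_distances(points))
    return bisect.bisect_right(d, t) / len(d)


def g_csr(t, lam):
    """G(t) = F(t) expected under CSR with intensity lam."""
    return 1.0 - math.exp(-lam * math.pi * t * t)


rng = random.Random(1)
n = 500  # events in the unit square, so the intensity is also 500
pattern = [(rng.random(), rng.random()) for _ in range(n)]
for t in (0.01, 0.02, 0.03):
    print(f"t={t:.2f}  G_hat={g_hat(pattern, t):.3f}  G_csr={g_csr(t, n):.3f}")
```

Because no edge correction is applied, the empirical curve should sit slightly below the theoretical one for events near the window boundary.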


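Definition of the K and L Statistics
K statistic uses the distances between all neighbors ($d_{ij}$) as (edge-correction weights omitted):

$$\hat{K}(t) = \frac{1}{\hat{\lambda} n} \sum_{i=1}^{n} \sum_{j \ne i} I(d_{ij} \le t)$$

Under CSR, K statistic can be approximated by

$$K(t) = \pi t^2$$

L statistic used to set mean to zero under CSR and (supposedly) stabilize variance as:

$$\hat{L}(t) = \sqrt{\hat{K}(t) / \pi} - t$$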
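A rough Python sketch of the K and L statistics follows. It assumes the uncorrected estimator $\hat{K}(t) = A n^{-2} \sum_i \sum_{j \ne i} I(d_{ij} \le t)$ for a window of area $A$, and the centred form $\hat{L}(t) = \sqrt{\hat{K}(t)/\pi} - t$; the function names and the unit-square window are illustrative assumptions, and no edge correction is applied.

```python
import math
import random


def k_hat(points, t, area=1.0):
    """Uncorrected K(t): area / n^2 times the number of ordered pairs within t."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, n):
            if math.hypot(xi - points[j][0], yi - points[j][1]) <= t:
                pairs += 2  # count both (i, j) and (j, i)
    return area * pairs / (n * n)


def l_hat(points, t, area=1.0):
    """Centred L(t): close to zero at every t for a CSR pattern."""
    return math.sqrt(k_hat(points, t, area) / math.pi) - t


rng = random.Random(3)
pattern = [(rng.random(), rng.random()) for _ in range(300)]
t = 0.05
print(f"K_hat={k_hat(pattern, t):.5f}  pi*t^2={math.pi * t * t:.5f}")
print(f"L_hat={l_hat(pattern, t):+.5f}")
```

Without edge correction the estimate is biased slightly downward, which is one reason the edge-correction question appears among the implementation issues.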


Building the Simulation Envelope
[Figure: G(t) against distance (0.00 to 0.20), first for a single CSR pattern, then for 99 CSR patterns]


Using the Simulation Envelope
Plot after subtracting $\bar{G}$
[Figure: $\hat{G} - \bar{G}$ (-0.3 to 0.3) against distance (0.00 to 0.20) for an rSSI(r=0.03, n=100) pattern]
Perceived Level Performance
Using all results from 19 simulations yields ( ), or ( )
Throwing out upper and lower 2 simulations at each distance ($t$) from 99 simulations also yields ( )


Kenkel (1988) Methods
Evaluated spatial locations of all live trees, and of all (live + standing dead) trees, in a jack pine (Pinus banksiana) forest.
Map of live + standing dead represents distribution following early sapling mortality, but prior to the onset of density-dependent mortality.
Methods: Used MC techniques for the G and L statistics to evaluate observed results against null hypotheses of i) random locations (CSR) and ii) random mortality.


Kenkel (1988) Conclusions
G: live + dead shows no departure from randomness, whereas live trees only show significant regularity
L: live + dead shows no departure from CSR at small scales, live trees show regularity at smaller scales
But is this interpretation correct?


Examples in Ecological Research
Author (Year)                          Statistics   Patterns in    “CI” (%)    Marginal
                                       Used         Sim Env (s)                Results (y/n)
Batista and Maguire (1998)             G, K         19             95%         n
Dolezal et al. (2004)                  K            99             95%         y
Freeman and Ford (2002)                G, K         99             99%         n
Grassi et al. (2004)                   K            99             95%         n
Hirayama and Sakimoto (2003)           K            19, 99         95%, 99%    n
Martens et al. (1997)                  L            99             95%         n
Moeur (1997)                           G, K         200            90%         n
Parish et al. (1999)                   G, K         19             95%         n
Salvador-Van Eysenrode et al. (2000)   G, K         1000           95%         y
Srutek et al. (2002)                   L            99             95%         y
Tirado and Pugnaire (2003)             K            1000           99%         n




Sim Env Level Performance
Simulation study with independent ‘trials’ of a CSR pattern against a CSR envelope.
Designate ‘failure’ if pattern exceeds envelope at any distance. (Type I error)
Expected type I error rate 0.05 ...
... actual type I error rate 0.5-0.7
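The kind of simulation study described here can be sketched in a few lines of Python. This is not the author's study: the pattern size (40 events), the 19-pattern min/max envelope, the 20-point distance grid, the uncorrected G estimator and every function name below are assumptions chosen to keep the example small. Each trial tests a fresh CSR pattern against a fresh CSR envelope and counts a failure if the pattern leaves the envelope at any distance.

```python
import bisect
import math
import random


def nn_distances(points):
    """Distance from each event to its nearest neighbouring event."""
    best = [math.inf] * len(points)
    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            d = math.hypot(xi - points[j][0], yi - points[j][1])
            if d < best[i]:
                best[i] = d
            if d < best[j]:
                best[j] = d
    return best


def g_curve(points, ts):
    """Empirical G evaluated at each distance in ts (no edge correction)."""
    d = sorted(nn_distances(points))
    return [bisect.bisect_right(d, t) / len(d) for t in ts]


def csr(n, rng):
    """n independent uniform events in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def envelope_rejects(n, s, ts, rng):
    """True if a CSR pattern leaves the min/max envelope of s CSR patterns."""
    observed = g_curve(csr(n, rng), ts)
    sims = [g_curve(csr(n, rng), ts) for _ in range(s)]
    for k, value in enumerate(observed):
        column = [sim[k] for sim in sims]
        if value < min(column) or value > max(column):
            return True
    return False


rng = random.Random(2005)
ts = [0.01 * k for k in range(1, 21)]  # distances 0.01 ... 0.20
trials = 150
rate = sum(envelope_rejects(40, 19, ts, rng) for _ in range(trials)) / trials
print(f"share of CSR patterns leaving the envelope: {rate:.2f}")
```

The exact rate depends on the distance grid and pattern size, but because the comparison is repeated at every distance it should land well above the level a single-distance comparison would give.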


Monte Carlo Simulation Theory
For a univariate continuous distribution, the rank of the observed value among the simulated values yields a test of exact level.
But does the simulation envelope comprise a univariate distribution?
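The univariate case can be checked numerically. The sketch below assumes the standard rank-based Monte Carlo p-value, (number of simulated values at least as large as the observed one, plus 1) divided by (s + 1), and draws both the observed and simulated values from the same Normal distribution; the function name and the choice of distribution are illustrative.

```python
import random


def mc_p_value(observed, simulated):
    """Upper-tail Monte Carlo p-value from the rank of the observed value."""
    at_least = sum(1 for x in simulated if x >= observed)
    return (at_least + 1) / (len(simulated) + 1)


rng = random.Random(42)
s, trials, alpha = 19, 20000, 0.05
rejections = 0
for _ in range(trials):
    observed = rng.gauss(0.0, 1.0)  # drawn under the null
    simulated = [rng.gauss(0.0, 1.0) for _ in range(s)]
    if mc_p_value(observed, simulated) <= alpha:
        rejections += 1
rate = rejections / trials
print(f"rejection rate under the null: {rate:.3f}")
```

With 19 simulations the test rejects only when the observed value is the largest of the 20, so the rejection rate should sit close to 1/20.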


How the Envelope is Really Made
Simulation envelope built from 100 patterns:
[Figure: deviation of $\hat{G}$ (-0.3 to 0.3) against distance (0.00 to 0.25); 55 patterns comprising the simulation envelope]


Failure of the Simulation Envelope
Although built from patterns, complexity of both
1. G, F, and/or K statistics, and
2. spatial patterns
yields a multivariate result.
Since evaluation of the observed pattern occurs at many distances, we are performing simultaneous inference, and thus the Type I error rate ($\alpha$) is increased.
Further, if the simulation envelope is invalid, then how can we use it to determine scale?




Proper Statistical Methods
From Diggle (1983, 2003), for a given :


1. At a single a priori distance - use upper and lower
simulated values

2. Across a range of distances - use Goodness of Fit test



The Goodness of Fit Test - 1
1. Represent the empirical results as:
$\hat{H}_1(t)$ for the observed pattern, and
$\hat{H}_i(t)$, $i = 2, \dots, s+1$, for $s$ simulated patterns
where $H$ stands for the chosen statistic (G, F, K or L)


The Goodness of Fit Test - 2
2. Calculate:

$$u_i = \int_{t_{\min}}^{t_{\max}} \left( \hat{H}_i(t) - H(t) \right)^2 dt$$

for $i = 1, \dots, s+1$
Summary statistic indicative of the total deviation of the given pattern from the theoretical result
but use

$$\bar{H}_i(t) = \frac{1}{s} \sum_{j \ne i} \hat{H}_j(t)$$

in place of $H(t)$ to reduce bias


The Goodness of Fit Test - 3
3. Reject (fail to) $H_0$ based on the rank of $u_1$ using the p-value, calculated as

$$\hat{p} = 1 - \frac{j - 1}{s + 1}$$

for $u_1 = u_{(j)}$, the $j$-th smallest of the $s + 1$ values. So, if $j = s + 1$ (the largest), then $\hat{p} = 1/(s+1)$
Now we have quantitative results to evaluate a pattern’s significance based on an “exact” level test because of proper MC methods


Unresolved Implementation Issues
What is the optimal method to calculate the test statistic $u_i$?

$$u_i = \int_{t_{\min}}^{t_{\max}} \left( \hat{H}_i(t) - \bar{H}_i(t) \right)^2 dt$$

How to:
replace integration with summation
incorporate edge correction methods
choose limits ($t_{\min}$, $t_{\max}$), distance list
simulate patterns from null process


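Replacing Integration with Summation
We can rewrite Eqn (1) as

$$u_i = \int_{t_{\min}}^{t_{\max}} \left( \hat{H}_i(t) - \bar{H}_i(t) \right)^2 dt \approx \sum_{t_k = t_{\min}}^{t_{\max}} \left( \hat{H}_i(t_k) - \bar{H}_i(t_k) \right)^2 \delta t_k$$

where $\delta t_k = t_{k+1} - t_k$
But how accurate is this approximation?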
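A compact Python sketch of the goodness of fit test with the integral replaced by a sum is given below. It makes several assumptions that are not taken from the slides: the G statistic without edge correction, an evenly spaced distance grid with step `dt`, each curve compared with the mean of the other curves, and the p-value taken as (s + 2 - rank) / (s + 1) with rank 1 for the smallest $u$. The observed pattern is a deliberately regular 7 x 7 lattice, so a small p-value is expected.

```python
import bisect
import math
import random


def nn_distances(points):
    """Distance from each event to its nearest neighbouring event."""
    best = [math.inf] * len(points)
    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            d = math.hypot(xi - points[j][0], yi - points[j][1])
            if d < best[i]:
                best[i] = d
            if d < best[j]:
                best[j] = d
    return best


def g_curve(points, ts):
    """Empirical G evaluated at each distance in ts (no edge correction)."""
    d = sorted(nn_distances(points))
    return [bisect.bisect_right(d, t) / len(d) for t in ts]


def csr(n, rng):
    """n independent uniform events in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def gof_p_value(curves, dt):
    """curves[0]: observed pattern; curves[1:]: s patterns simulated under H0."""
    s = len(curves) - 1
    totals = [sum(column) for column in zip(*curves)]
    u = []
    for curve in curves:
        # Summed squared deviation from the mean of the other s curves.
        u.append(dt * sum((h - (tot - h) / s) ** 2
                          for h, tot in zip(curve, totals)))
    rank = 1 + sum(1 for value in u[1:] if value < u[0])  # 1 = smallest u
    return (s + 2 - rank) / (s + 1)


rng = random.Random(11)
dt = 0.005
ts = [dt * k for k in range(1, 41)]  # distances 0.005 ... 0.200
lattice = [((i + 0.5) / 7, (j + 0.5) / 7) for i in range(7) for j in range(7)]
curves = [g_curve(lattice, ts)]
curves += [g_curve(csr(len(lattice), rng), ts) for _ in range(19)]
p = gof_p_value(curves, dt)
print(f"GoF p-value for a regular lattice against CSR: {p:.2f}")
```

A fixed grid is only one way to pick the distance list; a complete empirical distance list would replace `ts` and the constant `dt` with the observed jump points and their spacings.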


Edge Correction
Used to eliminate bias from edge interfering with detecting a point’s neighbor
Reduced Sample edge correction approach:
Let $b_i$ be the distance for point $i$ to the closest boundary
Remove point $i$ from calculation at distance $t$ where $t > b_i$
Other approaches (toroidal, isotropic, etc.)
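The reduced sample idea can be illustrated with a tiny Python example. The rule assumed here is that a point contributes to the estimate at distance t only if its distance to the closest boundary is at least t; the rectangular window, the three hand-picked points and the function names are all illustrative.

```python
import math


def nn_distances(points):
    """Distance from each event to its nearest neighbouring event."""
    best = [math.inf] * len(points)
    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            d = math.hypot(xi - points[j][0], yi - points[j][1])
            if d < best[i]:
                best[i] = d
            if d < best[j]:
                best[j] = d
    return best


def g_uncorrected(points, t):
    """Share of all points with a nearest neighbour within t."""
    nn = nn_distances(points)
    return sum(1 for d in nn if d <= t) / len(nn)


def g_reduced_sample(points, t, width=1.0, height=1.0):
    """G(t) using only points at least t away from every window edge."""
    nn = nn_distances(points)
    kept = hits = 0
    for (x, y), d in zip(points, nn):
        boundary = min(x, width - x, y, height - y)
        if boundary >= t:
            kept += 1
            if d <= t:
                hits += 1
    return hits / kept if kept else float("nan")


# The third point sits 0.05 from the left edge; its true nearest neighbour
# could lie outside the window, so it is dropped once t exceeds 0.05.
points = [(0.5, 0.5), (0.5, 0.6), (0.05, 0.5)]
print(g_uncorrected(points, 0.2), g_reduced_sample(points, 0.2))
```

The price of this correction is a shrinking sample as t grows, which is why the empirical function can also change where a point is removed.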



Choice of Limits ($t_{\min}$, $t_{\max}$), Distance List ($t_k$)
A default is recommended for the limits, but the choice is application dependent!
The empirical functions (e.g. $\hat{G}$, $\hat{K}$) are discrete, change where
new neighbor detected, or
point removed from sample
Use empirical distance list for exact results from a single pattern
Because of the test statistic ($u_i$) calculation, especially the mean over the other patterns ($\bar{H}_i(t)$), for exact solution, need to use complete empirical distance list (i.e. from all patterns) for evaluation of each pattern


Resolution of Simulated Patterns
Complexity? - Number of distances grows with the number of points ($n$) and the number of simulated patterns ($s$)
Resolution (i.e. measurement precision) of simulated patterns should be equivalent to that of observed pattern
Limiting resolution helps constrain complexity
( ) is highly accurate for ecological data (Freeman and Ford, 2002)
Combining resolution and the default limit leads to at most 25,000 distances in the distance list, regardless of $n$, $s$, or test statistic, and provides an exact solution




Parameterization - 1
“How to run any given test based on the ecological research question”
Number of simulations ($s$)
Choice of ( ), including choice of ( )



Number of Simulations ($s$) versus Precision of the p-value
Uncertainty in realized p-value ($\hat{p}$) results from the use of MC simulations
Ramifications of $s$? Affects precision of $\hat{p}$ through
actual simulated patterns against which observed pattern tested, and
number of those patterns
Note about exact level performance (across many tests vs. variation of p-value for single test)



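Distribution of $P$
Let $Y$ be the statistic for the observed pattern and $X_i$ the statistic for the $i$-th simulated pattern, for $i = 1, \dots, s$. The p-value for the test is then approximately:

$$P \approx \frac{1}{s} \sum_{i=1}^{s} I(X_i \ge Y)$$

The expected value of P is:

$$E[P] = \frac{1}{s} \sum_{i=1}^{s} \Pr(X_i \ge Y)$$

Given $Y$, each $X_i$ simulated under $H_0$ is at least as large as $Y$ with probability $p$, the true p-value. So, each of the $s$ indicators is a Bernoulli trial with success probability $p$, and $E[P] = p$.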


Variance of P ($\sigma_p^2$)
Looking at the variance of $P$ we have

$$\sigma_p^2 = \mathrm{Var}(P) = \frac{p(1 - p)}{s}$$

Hence we can model the theoretical distribution of $sP$ as from a binomial(p,s) distribution.
Managing Uncertainty in $\hat{p}$
Remember that the binomial quickly converges to the Normal
Create 95% CI on $p$ (true p-value) as

$$\hat{p} \pm 1.96 \sqrt{\hat{p}(1 - \hat{p}) / s}$$

95% of CIs created this way should contain the true value of $p$, and so set decision rule: e.g. reject $H_0$ if CI contains or is fully below 0.05
Choose acceptable range of uncertainty for $\hat{p}$.
Use relationship between $\sigma_p$ and $s$ to find value of $s$
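The relationship between the spread of the realized p-value and the number of simulations can be worked through numerically. The sketch below assumes the binomial model, so that $\sigma_p = \sqrt{p(1-p)/s}$, and a Normal-approximation 95% interval; the target half-width of 0.01 is a hypothetical choice, not a value from the slides.

```python
import math


def sigma_p(p, s):
    """Standard deviation of the realized p-value under the binomial model."""
    return math.sqrt(p * (1.0 - p) / s)


def simulations_needed(p, half_width, z=1.96):
    """Smallest s whose Normal-approximation interval is within +/- half_width."""
    return math.ceil(p * (1.0 - p) * (z / half_width) ** 2)


for s in (19, 99, 999, 1999):
    print(f"s={s:5d}  sigma_p at p=0.05: {sigma_p(0.05, s):.4f}")

# Hypothetical target: a 95% interval of +/- 0.01 around a true p-value of 0.05.
print(simulations_needed(0.05, 0.01))
```

Halving the acceptable half-width roughly quadruples the number of simulations required, since $s$ scales with the inverse square of $\sigma_p$.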


$\sigma_p$ as a function of $s$
[Figure: $\sigma_p$ (0.01 to 0.07) against number of simulations $s$ (0 to 2000)]


Choice of $H_0$
Use all available ecological knowledge for a more informative test
Null point process just needs to be able to be simulated; many models available (e.g. spatstat) or write your own!
At the very least, choose simple inhibition model based on physical separation
EDA vs. confirmatory analysis results in iterative nature of research, with (hopefully) tests on independent data sets
Use the model to determine information on scale!
Example of model fitting
Attempt to fit a clustered model, representing establishment processes, to the lower SW quadrant of the WRCCRF data, for all trees in a given height class.
Used Poisson Clustered model, where $\rho$ represents the number of parents, a second parameter represents the expected number of children per parent, and clustering of ‘children’ around each parent is described by a distribution with scale $\sigma$
How to choose values for $\rho$ and $\sigma$?
Note that my null ‘model’ here describes not only the process, but also the parameter values.
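A null model only needs to be something that can be simulated, and a Poisson cluster process is easy to write by hand. The Python sketch below is one possible version under stated assumptions: a fixed number of parents `rho` placed uniformly in the unit square, a Poisson number of children per parent with mean `mu`, independent Gaussian displacements with standard deviation `sigma`, and children falling outside the window discarded. The displacement distribution and all names are assumptions for illustration, not the model definition from the slides.

```python
import math
import random


def poisson(mean, rng):
    """Poisson draw by Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def poisson_cluster(rho, mu, sigma, rng):
    """Children of rho uniform parents, scattered with Gaussian spread sigma."""
    children = []
    for _ in range(rho):
        px, py = rng.random(), rng.random()
        for _ in range(poisson(mu, rng)):
            x = px + rng.gauss(0.0, sigma)
            y = py + rng.gauss(0.0, sigma)
            if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
                children.append((x, y))  # keep only children inside the window
    return children


rng = random.Random(7)
pattern = poisson_cluster(rho=20, mu=5.0, sigma=0.05, rng=rng)
print(f"{len(pattern)} events from 20 parents")
```

Patterns simulated this way can stand in for the CSR simulations in a goodness of fit test, so that each (rho, sigma) combination becomes its own null hypothesis.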


Example of model fitting - 2
This is Exploratory Data Analysis!
If we knew the theoretical value of G, K for this model, use Diggle’s ‘Least Squares Estimation’ method
Otherwise, use GoF test to estimate parameter space
Find $\hat{p}$ for different combinations of $\rho$ and $\sigma$, and ‘accept’ model where ( )
[Figure: $\sigma$ (0.1 to 0.4) against $\rho$ (0 to 100); panels a) G statistic, b) K statistic]
Example of model fitting - 3
Inference? For the observed data, if this model fits, then larger $\sigma$ suggests lower $\rho$ (i.e. few parents) and so more children/parent.
Conversely a smaller clustering radius requires higher $\rho$ and so fewer children per parent.
Is this model a good fit? What might the physiological and/or ecological implications be?
$\sigma$ gives us hints about scale.


$t_{\max}$, Variance stabilization
The upper distance limit ($t_{\max}$) should be chosen before the test, and based on research question. (i.e. what is the interaction distance of interest?)
[Figure: K(t) deviation (-0.10 to 0.05) against distance (0.00 to 0.20)]
Variance stabilization - to make variance independent of $t$.


Type I Error Rate ($\alpha$) - 1
Simulation study of Type I error rate performance
Evaluated different $\alpha$ levels, for different point pattern intensities ($\lambda$)
Results within LRT boundaries


Type I Error Rate ($\alpha$) - 2
Simulations of 1000 independent trials
[Figure: estimated Type I error rate $\hat{\alpha}$ (0.00 to 0.15) against number of points $\lambda$ (0 to 250); panels a) Type I error rates for G, b) Type I error rates for K]
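A much smaller version of such a Type I error study can be sketched in Python. Everything specific here is an assumption made to keep the run short: 200 trials instead of 1000, 30 events per pattern, 19 simulations per test, the uncorrected G statistic on a fixed distance grid, and the leave-one-out form of the goodness of fit statistic. Each trial draws the "observed" pattern from CSR as well, so every rejection is a Type I error.

```python
import bisect
import math
import random


def nn_distances(points):
    """Distance from each event to its nearest neighbouring event."""
    best = [math.inf] * len(points)
    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            d = math.hypot(xi - points[j][0], yi - points[j][1])
            if d < best[i]:
                best[i] = d
            if d < best[j]:
                best[j] = d
    return best


def g_curve(points, ts):
    """Empirical G evaluated at each distance in ts (no edge correction)."""
    d = sorted(nn_distances(points))
    return [bisect.bisect_right(d, t) / len(d) for t in ts]


def csr(n, rng):
    """n independent uniform events in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def gof_p_value(curves, dt):
    """curves[0]: observed pattern; curves[1:]: s patterns simulated under H0."""
    s = len(curves) - 1
    totals = [sum(column) for column in zip(*curves)]
    u = [dt * sum((h - (tot - h) / s) ** 2 for h, tot in zip(curve, totals))
         for curve in curves]
    rank = 1 + sum(1 for value in u[1:] if value < u[0])  # 1 = smallest u
    return (s + 2 - rank) / (s + 1)


def gof_rejects(n, s, ts, dt, alpha, rng):
    """One trial: test a CSR pattern against s CSR simulations."""
    curves = [g_curve(csr(n, rng), ts) for _ in range(s + 1)]
    return gof_p_value(curves, dt) <= alpha


rng = random.Random(43)
dt = 0.005
ts = [dt * k for k in range(1, 41)]  # distances 0.005 ... 0.200
trials = 200
rate = sum(gof_rejects(30, 19, ts, dt, 0.05, rng) for _ in range(trials)) / trials
print(f"estimated Type I error rate at alpha = 0.05: {rate:.3f}")
```

With only 200 trials the estimate is noisy, but it should scatter around the nominal level rather than far above it.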
Type II Error Rate (1-Power)
Type II error rate is the prob of accepting $H_0$ given that $H_1$ is really true.
Requires definition of $H_1$.
Power will be a function of ‘how far’ $H_1$ is from $H_0$. (‘Easy’ to think of this distance when using Normal distribution, but more difficult to conceptualize here.)
Often overlooked for spatial point process analysis, but can be simulated.
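Power can be estimated by simulating patterns from a chosen alternative and counting how often the test rejects. The Python sketch below assumes a strong simple sequential inhibition alternative (40 events, minimum spacing 0.09 in the unit square), a CSR null, 19 simulations per test, the uncorrected G statistic and only 20 trials; the alternative, its parameters and the function names are illustrative choices rather than anything specified in the slides.

```python
import bisect
import math
import random


def nn_distances(points):
    """Distance from each event to its nearest neighbouring event."""
    best = [math.inf] * len(points)
    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            d = math.hypot(xi - points[j][0], yi - points[j][1])
            if d < best[i]:
                best[i] = d
            if d < best[j]:
                best[j] = d
    return best


def g_curve(points, ts):
    """Empirical G evaluated at each distance in ts (no edge correction)."""
    d = sorted(nn_distances(points))
    return [bisect.bisect_right(d, t) / len(d) for t in ts]


def csr(n, rng):
    """n independent uniform events in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def ssi(n, r, rng, max_tries=100000):
    """Sequential inhibition: accept a candidate only if no event is within r."""
    points, tries = [], 0
    while len(points) < n and tries < max_tries:
        tries += 1
        cx, cy = rng.random(), rng.random()
        if all(math.hypot(cx - px, cy - py) >= r for px, py in points):
            points.append((cx, cy))
    return points


def gof_p_value(curves, dt):
    """curves[0]: observed pattern; curves[1:]: s patterns simulated under H0."""
    s = len(curves) - 1
    totals = [sum(column) for column in zip(*curves)]
    u = [dt * sum((h - (tot - h) / s) ** 2 for h, tot in zip(curve, totals))
         for curve in curves]
    rank = 1 + sum(1 for value in u[1:] if value < u[0])  # 1 = smallest u
    return (s + 2 - rank) / (s + 1)


rng = random.Random(5)
dt = 0.005
ts = [dt * k for k in range(1, 41)]  # distances 0.005 ... 0.200
trials, s = 20, 19
rejections = 0
for _ in range(trials):
    observed = ssi(40, 0.09, rng)  # pattern drawn from the alternative
    curves = [g_curve(observed, ts)]
    curves += [g_curve(csr(len(observed), rng), ts) for _ in range(s)]
    if gof_p_value(curves, dt) <= 0.05:
        rejections += 1
power = rejections / trials
print(f"estimated power against the inhibition alternative: {power:.2f}")
```

Weakening the alternative (a smaller inhibition distance) or shortening the distance range would be the natural way to explore how power falls off as the alternative moves closer to the null.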


Analysis of Type II Error Rate
Analysis of power against $H_0$ of CSR for WRCCRF example for different parameterizations of $H_1$.
Type II error rate tells us the ability to distinguish the pattern from CSR.
As $\sigma$ increases, larger clusters are more like CSR.
[Figure: power (0.0 to 1.0) against $\sigma$ (0.05 to 0.35); panels a) $\rho$ = 20, b) $\rho$ = 40]


Power of the G Statistic
‘Large’ deviation at small distances may be swamped out
[Figure: deviation of $\hat{G}$ (-0.3 to 0.3) against distance (0.00 to 0.20) for rSSI(r=0.02) and rSSI(r=0.03) patterns]


Parameters that may improve Power
Rewriting Equation (2) in its ‘full’ form (Diggle, 2003):

$$u_i = \int_{t_{\min}}^{t_{\max}} w(t) \left( [\hat{H}_i(t)]^c - [\bar{H}_i(t)]^c \right)^2 dt$$

$w(t)$, $c$ as parameters to improve Power against certain alternatives
Use of $w(t)$ not well explored, but could be used to emphasize certain distances.
For my calculations, ( )
For $c$,
use $c = 0.5$ for L statistic.
use $c = 0.25$ for power against clustered patterns (Diggle, 2003)
other?


Conclusions
Simulation envelope does not result in expected Type I error rates. Limits are not confidence intervals.
For more precise, reliable results, implement Diggle’s goodness of fit test
Previous marginal results should be re-examined
Choice of $H_0$, based on research question and previous knowledge
Evaluate the Power of your test

R software availability:
http://students.washington.edu/nhl/masters.html


R software resources
CRAN (Comprehensive R Archive Network) site
http://cran.r-project.org/

A. Baddeley’s spatstat package
http://www.maths.uwa.edu.au/~adrian/spatstat.html

P. Diggle’s splancs package
http://www.maths.lancs.ac.uk/~rowlings/Splancs/

UW R and S-plus user support group
http://mailman1.u.washington.edu/mailman/listinfo/s plus
