
Dr. Syartinilia Wijaya
Course: Dasar SIG dan Lingkungan (Fundamentals of GIS and Environment)

Today's outline:
What are species distribution models?
Why species distribution models?
How is GIS used in modelling species distributions?
Modelling approaches
Logistic regression
Autologistic regression

What are species distribution models?


Empirical models that relate field observations of a species to environmental predictors based on statistically or theoretically derived response surfaces (Guisan and Zimmermann, 2000)


A model which uses a species' observed distribution and/or biological characteristics to predict its actual (or potential) distribution
Observed: range within which the species has been sighted
Actual: the species' current distribution
Potential: range within which the species could be found

Term is widespread but somewhat misleading since it is actually the
distribution of suitable environments that is being modeled, rather
than the species distribution itself

Also known as ecological niche, environmental niche, habitat
suitability and bioclimate envelope modeling

Why model species distributions?



Important component of conservation planning
Improve our understanding of species-habitat relationships in space and time
Predict patterns of biodiversity
Identify areas of conservation significance
Predict species invasions and identify areas at risk
Identify suitable areas for reintroducing species
Locate study sites


GIS and species distribution modeling: the perfect match
With the growing availability of digitised spatial data, GIS have received growing interest from conservation biologists

GIS have developed from tools primarily designed for spatial data management and cartography into sophisticated decision support systems that utilise spatial and tabular analysis to derive new information

GIS can carry out additional processing of model output, such as removing predicted areas that are isolated from observed species records by a dispersal barrier

Modeling approaches

Correlative: combine known occurrence records with digital layers of environmental variables

Deductive: use a species' biological and ecological requirements to generate predictions regarding its suitable habitat

Correlative approach

Assumes the observed distribution of a species provides useful information as to the environmental requirements of that species
Suitable when spatially explicit occurrence records are
available
Most common approach as occurrence records are
available for a large number of species

Building a correlative species distribution model

[Flowchart boxes, in the order extracted from the slide:]
Map observed species distribution
Generate environmental layers
Apply modelling algorithm
Test predictive performance
Model calibration
Map predicted distribution

Deductive approach

Uses deduction, a logical method for identifying specific consequences originating from a known set of facts

Suitable when the biology of the species is known, but there are few observations in the wild

Building a deductive species distribution model

[Flowchart boxes, in the order extracted from the slide:]
Generate predictions regarding species' suitable habitat
Model calibration
Generate environmental layers
Test predictive performance
Map predicted distribution

Problems with the deductive approach

Do the variables chosen actually influence the distribution of the species directly?


More intense validation of models required
Higher probability that major changes to the model
will be required
Few observations also means few records for
validation of models
Over/underestimation of actual distribution

Logistic regression
Logistic regression (LR) is a model used to predict the probability of occurrence of an event when the dependent variable is dichotomous, such as presence (1) / absence (0).

$$P_i = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_{j=1}^{k} \beta_j x_{ji}\right)\right]}$$
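
The logistic model above can be evaluated for a single location with a few lines of Python. This is a minimal sketch that assumes the standard form with a negative exponent; the function name, the coefficient values and the two example predictors are hypothetical and do not come from the lecture.

```python
import math

def occurrence_probability(intercept, coefficients, predictors):
    """Return P_i = 1 / (1 + exp(-(b0 + sum_j b_j * x_ji)))."""
    if len(coefficients) != len(predictors):
        raise ValueError("one coefficient is needed per predictor")
    # Linear predictor: intercept plus the weighted sum of the predictors.
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients for two environmental predictors at one site.
p = occurrence_probability(-1.0, [0.8, -0.5], [2.0, 1.0])
print(round(p, 3))
```

A linear predictor of zero gives a probability of 0.5, and larger positive values push the probability towards 1.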

Nest-site presence data: Gunung Gede Pangrango National Park

Pseudo-Absence
Reliable information on absence is more easily achieved with plant census (Zaniewski et al., 2002; Ottaviani et al., 2004), while the detection of animal species often can be influenced by elusive behavior, poorly accessible habitat, and activity patterns (Boone and Krohn, 1999; Zielinski and Stauffer, 1996; Ottaviani et al., 2004).

When only absence data or only presence data are available, it is difficult to incorporate these into a robust statistical evaluation procedure.

Therefore many studies have sought to apply presence-absence techniques to presence-only data by generating pseudo-absence data from background areas from which species data are missing.

The manner in which pseudo-absences are generated is particularly important because it can have a significant influence on the final quality of the model (Zaniewski et al., 2002; Engler et al., 2004).

The easiest way to choose pseudo-absences is simply to generate them totally at random over the study area (Hirzel et al., 2001; Zaniewski et al., 2002), but this carries the risk of generating an absence in an area that is, in fact, favourable to the species.

Indeed, when dealing with common species, choosing such a wrong absence may not be too problematic because the numerous presence records will counteract its effect.

However, when working with endangered species, data are often scarce, and choosing a wrong absence could significantly reduce the quality of a model.

To avoid, or at least reduce, this problem, more subtle methods can be employed to generate the pseudo-absence data.
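
The random approach described above can be sketched in Python as follows. With `min_dist=0` the points are drawn totally at random over a rectangular study area. A positive `min_dist` keeps pseudo-absences away from known presences, which is only one possible refinement and an assumption of this sketch, since the slides do not say which method was used. The function name, parameters and coordinates are hypothetical.

```python
import math
import random

def random_pseudo_absences(presences, bounds, n, min_dist=0.0,
                           seed=0, max_tries=100000):
    """Draw n random points inside bounds, each at least min_dist
    away from every presence record."""
    xmin, ymin, xmax, ymax = bounds
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    points = []
    tries = 0
    while len(points) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place enough pseudo-absences")
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        # Reject candidates that fall too close to a known presence.
        if all(math.hypot(x - px, y - py) >= min_dist
               for px, py in presences):
            points.append((x, y))
    return points

# Hypothetical study area of 10 x 10 map units with one nest site.
absences = random_pseudo_absences([(5.0, 5.0)], (0, 0, 10, 10),
                                  n=20, min_dist=2.0)
print(len(absences))
```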

Pseudo-absence data created

Autologistic regression
The classic statistical methods used to analyze species-environment relationships assume the sample values of the response and explanatory variables to be statistically independent.


However, spatial autocorrelation is frequently encountered in ecological data, and many ecological theories and models implicitly assume an underlying spatial pattern in the distribution of organisms and their environment (Legendre and Fortin, 1989).

Spatial autocorrelation is when the value at any one point in space is dependent on values at the surrounding points.

Typically, species abundances are positively autocorrelated such that nearby points in space tend to have more similar values than would be expected by random chance.

Some spatial autocorrelation cannot be modeled satisfactorily by environmental covariates. Thus, models that ignore spatial autocorrelation may be inappropriate because they might overestimate the importance of environmental variables (Lichstein et al., 2002).

In addition, such models could include variables that have little or no relevance to the response variable (Betts et al., 2006), creating false ecological conclusions in modeling the spatial distribution of species (De Frutos et al., 2006).

This problem could be solved by incorporating spatial autocorrelation (an autocovariate) into logistic regression models, which would result in model improvements such as increased predictive accuracy and model versatility (Legendre, 1993; Augustin et al., 1996; Koutsias, 2003; Betts et al., 2006; Piorecky and Prescott, 2006).

Autologistic regression (ALR) is a model that accounts for spatial autocorrelation through the addition of an autocovariate variable.
$$P_i = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_{j=1}^{k} \beta_j x_{ji} + c\,\mathrm{autocov}_i\right)\right]}$$

The autocovariate (AUTOCOV) can be estimated from predicted probabilities of occupation estimated by an ordinary logistic regression model
$$\mathrm{autocov}_i = \frac{\sum_{j=1}^{k} w_{ij}\, p_j}{\sum_{j=1}^{k} w_{ij}}$$

where $w_{ij}$ is the inverse of the Euclidean distance between $i$ and $j$, and $p_j$ represents the predicted probability estimated by LR.

The autocovariate is a weighted average of the number of occupied pixels amongst a set of neighbors of the focal pixel (Augustin et al., 1996).
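
The autocovariate and its use in the autologistic model can be sketched in Python as below. This is an illustration under stated assumptions, not the implementation used in the lecture: the probability surface is a plain list of lists, distances are measured in pixel units, the neighborhood is every pixel within `radius` of the focal pixel, and the function names and example grid are hypothetical.

```python
import math

def autocovariate(prob, row, col, radius):
    """Inverse-distance-weighted mean of the LR probabilities of the
    neighbours of the focal pixel (row, col)."""
    n_rows, n_cols = len(prob), len(prob[0])
    reach = int(radius)
    weighted_sum = 0.0
    weight_total = 0.0
    for r in range(max(0, row - reach), min(n_rows, row + reach + 1)):
        for c in range(max(0, col - reach), min(n_cols, col + reach + 1)):
            dist = math.hypot(r - row, c - col)
            if dist == 0.0 or dist > radius:
                continue  # skip the focal pixel and pixels outside the radius
            weight = 1.0 / dist  # w_ij: inverse Euclidean distance
            weighted_sum += weight * prob[r][c]
            weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else 0.0

def autologistic_probability(intercept, coefficients, predictors,
                             c_auto, autocov):
    """Logistic model with the extra term c_auto * autocov."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    linear += c_auto * autocov
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical 3 x 3 surface of LR probabilities; focal pixel in the centre.
grid = [[0.0, 1.0, 0.0],
        [1.0, 0.5, 1.0],
        [0.0, 1.0, 0.0]]
print(autocovariate(grid, 1, 1, 1.0))
```

With a radius of one pixel only the four orthogonal neighbours are used. A larger radius brings in more distant pixels with smaller weights.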

Neighborhood sizes used for autocovariate calculation

In image-processing terms, the autologistic term acts as a smoothing filter, removing isolated pixels and consolidating habitat patches defined as suitable (Osborne et al., 2001).

[Figure: neighborhoods around the focal pixel at 450 m, 750 m, 1050 m, 1350 m and 1500 m; layers AUTOCOV_15, AUTOCOV_25, AUTOCOV_35, AUTOCOV_45, AUTOCOV_50; autocovariate by 6*6 window size (AUTOCOV)]

Model Accuracy
Model selection was based on the following criteria:

1) variability accounted for by the model, measured by Nagelkerke R2, which is analogous to R2 in ordinary least squares (OLS) regression (Peng et al., 2002; Piorecky and Prescott, 2006),

2) predictive power (Kunkel and Pletscher, 2000; Piorecky and Prescott, 2006; Ottaviani et al., 2004). Predictive power was assessed with classification tables and receiver operating characteristic (ROC) plots.
ROC plots assess model success across a full range of dichotomies and not just at a single cut-off point.

The area under the curve (AUC) of the ROC is a single measure of overall fit, ranging from 0.5 for chance performance to 1.0 for a perfect fit (Osborne et al., 2001; Piorecky and Prescott, 2006).
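
The AUC described above can be computed without drawing the curve, because it equals the probability that a randomly chosen presence receives a higher predicted value than a randomly chosen absence. The sketch below uses that pairwise definition, with ties counted as one half. The function name and the example values are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the share of presence/absence pairs in which the
    presence has the higher predicted probability (ties count 0.5)."""
    presences = [s for y, s in zip(labels, scores) if y == 1]
    absences = [s for y, s in zip(labels, scores) if y == 0]
    if not presences or not absences:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presences) * len(absences))

# Hypothetical observations (1 = presence, 0 = absence) and predictions.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3]))  # 0.75
```

The pairwise loop is quadratic in the number of records, which is fine for small validation sets. Rank-based formulas give the same value faster for large ones.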

Visual comparison based on field knowledge and the true distribution of the species.

This approach was also used by Augustin et al. (1996) in model selection for the spatial distribution of red deer.

Model Validation
The validation phase is crucial in assessing the accuracy of any predictions.


In the validation, two types of errors can be detected:

1. an omission error, when the model predicts unsuitable habitat even though the species has in fact been found at that location

2. a commission error, when the model predicts suitable habitat but the species has not been reported there. Commission errors are unavoidable, as not all suitable habitats are likely to be occupied concurrently (Boone and Krohn, 1999, 2000; Ottaviani et al., 2004).
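
The two error types can be counted from a classification table at a chosen cut-off point. The sketch below is a hedged illustration: expressing omission as a share of the presences and commission as a share of the absences is an assumption made here, and the function name, the 0.5 cut-off and the example data are hypothetical.

```python
def error_rates(observed, predicted_prob, cutoff=0.5):
    """Return (omission_rate, commission_rate) at the given cut-off.

    omission: presences that the model classifies as unsuitable
    commission: absences that the model classifies as suitable
    """
    omitted = committed = presences = absences = 0
    for obs, prob in zip(observed, predicted_prob):
        suitable = prob >= cutoff
        if obs == 1:
            presences += 1
            if not suitable:
                omitted += 1
        else:
            absences += 1
            if suitable:
                committed += 1
    if presences == 0 or absences == 0:
        raise ValueError("need at least one presence and one absence")
    return omitted / presences, committed / absences

# Hypothetical validation records: three presences and two absences.
omission, commission = error_rates([1, 1, 1, 0, 0],
                                   [0.9, 0.4, 0.7, 0.6, 0.1])
print(omission, commission)
```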


Validation is achieved by testing the predicted probability distribution of a species, as represented by the habitat suitability model, against evidence recorded in the field.

To verify that predictions are robust, general, and unbiased, model predictions must be compared with independent data sets, i.e., ones that have not been previously used in building the model.

There is wide consensus that models should be validated using independent data sets to minimize errors such as over-fitting data, over-rating model performance and underestimating the error rate in subsequent applications (Guisan and Zimmermann, 2000; Lehmann et al., 2002; Ottaviani et al., 2004).
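
When no separately collected data set exists, one common stand-in is to hold back part of the records before model building. The sketch below shows such a random split. It is an assumption of this example rather than the procedure described in the lecture, and a random split of one survey is weaker evidence than truly independent field data. The function name and the 30 % fraction are hypothetical.

```python
import random

def split_records(records, validation_fraction=0.3, seed=0):
    """Shuffle the records and return (calibration, validation) lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_validation = int(round(len(shuffled) * validation_fraction))
    # The first n_validation shuffled records are held back for validation.
    return shuffled[n_validation:], shuffled[:n_validation]

# Hypothetical record identifiers 0..9.
calibration, validation = split_records(range(10))
print(len(calibration), len(validation))  # 7 3
```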

Thank you

Free-download GIS & RS data sources

Digital Elevation Model
http://srtm.csi.cgiar.org/ (90 m resolution)
http://asterweb.jpl.nasa.gov/gdem-wist.asp (30 m resolution)
Administrative boundaries (global):
http://www.diva-gis.org/gdata
Protected area boundaries (global):
http://www.wdpa.org/
Satellite imagery
http://glovis.usgs.gov/
Land cover (global):
http://glcf.umiacs.umd.edu/data/
Climate data (global):
http://www.worldclim.org/
