Cassey and Mcardle

ENVIRONMETRICS
Environmetrics, 10, 261–278 (1999)
AN ASSESSMENT OF DISTANCE SAMPLING TECHNIQUES FOR ESTIMATING ANIMAL ABUNDANCE
PHILLIP CASSEY1* AND BRIAN H. MCARDLE2
1 School of Environmental and Marine Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand
2 Biostatistics Unit, School of Biological Sciences, University of Auckland, Private Bag 92019, Auckland, New Zealand
SUMMARY
Line transects have been widely applied for the estimation of animal abundance because they are regarded
as simple, economical, and relatively precise. The recent development of automated techniques for the
estimation of animal density from distance sampling data allows greater potential for field biologists and
wildlife managers to become involved in the analytical summary of their research. An assessment was made
of the ability of program DISTANCE to produce unbiased estimates of density in spite of potential sources of
error from the estimation of transect and population density. Populations were simulated to investigate the
robustness of program DISTANCE to changes in the density, distribution, and detection of animals across
sampling areas and transects. It is concluded that if distance sampling data is collected reliably from a
random sample of possible primary sampling units (PSUs) it can be expected that estimates of density will
be presented accurately and with correct estimates of variance. If, however, the proportion of the study
area surveyed by transects is large, then the presence of large between-PSU variation will cause the
variance estimates from program DISTANCE to be a sizeable overestimate.
Copyright © 1999 John Wiley & Sons, Ltd.
KEY WORDS
distance sampling; estimating animal abundance; line transect methods; program DISTANCE
1. INTRODUCTION
Knowledge of animal abundance is critical to the ecological theory and practice of studies in both
population biology (Krebs 1985; Soule 1986) and wildlife resource monitoring (Parmenter et al.,
1989; Sinnary and Hebrand 1991; Conroy et al., 1995). Important technological advances
relevant to population estimation are steadily contributing to the methodology (Seber 1992).
Consequently, methods for estimating animal abundance are continuing to develop, requiring
explicit evaluation before they can be applied with confidence in practical situations (Southwell
1994). Additionally, scientists and wildlife managers are being made increasingly accountable for
the accuracy and reliability of their results (Verner and Milne 1989).
Distance sampling methods (Buckland et al., 1993) have developed intensively over the last
20 years and have been widely applied for sampling animal populations (Burnham and Anderson
1984). The term distance sampling includes both line and point transect methods (circular plots,
trapping webs, and cue counting). The development of ideas in this paper, however, explicitly
follows the theory of only line transect estimators based on the measurement of perpendicular
distance from the transect.
* Correspondence to: P. Cassey, CRC-Trem School of AES, Griffith University, Nathan, Brisbane, QLD 4111, Australia.
CCC 1180–4009/99/030261–18$17.50
Received 1 May 1998

Revised 15 February 1999
In line transect sampling an observer traverses a randomly located line and measures or
estimates the perpendicular distance to each detected animal. In practice, a number of lines of
lengths $l_1, l_2, \ldots, l_k$ are used and their total length is denoted as $L$. The central concept of
distance sampling is the hypothesised detection function g(y):
\[
g(y) = \text{the probability of detecting an animal, given that it is at a distance } y
\text{ from the random line or point}
= \operatorname{prob}\{\text{detection} \mid \text{distance } y\}.
\]
Density is estimated from the encounters on individual lines, using either separate detection
functions, or more often a single detection function for all the lines. If a single line was sampled
the variance would be estimated using models hypothesised to describe the distribution of
animals as well as the sampling operation (Burnham et al., 1980; Buckland et al., 1993). In the
practical case of a number of lines being sampled, the variance is estimated using the between line
information from replicate line transects which does not depend on an assumed spatial
distribution of animals (Thompson 1992; Jensen 1996).
Distance sampling derives from classical closed or finite population sampling involving total
counts of randomly chosen primary sampling units, PSUs (Seber 1982). An estimate of total
population size is obtained by multiplying the average density per sampling area, estimated from
the PSUs, by the total area of the population. The key difference is that distance sampling
methods allow for the fact that many objects will remain undetected, as long as they are not on
the line. This follows from the assumption that animals on the line are detected with certainty,
g(0) = 1, and that detection is subsequently a monotonically non-increasing function of distance
away from the transect. Rather than a total count (e.g. strip transects), distance sampling
methods estimate the number of animals within each PSU. The distribution of animals with
respect to the transect is assumed to be uniform (Turnock and Quinn 1991). It is therefore critical
that the line be placed randomly with respect to the distribution of animals (Buckland et al.,
1993). Density is estimated based on the hypothetical shape of g(y).
Estimated density is
\[
\hat{D} = \frac{n \hat{f}(0)}{2L},
\]
where $n$ = number of animals detected, and $\hat{f}(0)$ = value of the probability density function
(pdf) of detected distances at zero distance. Because the area of a strip of incremental width dy at
distance y from the line is independent of y, it may be proven that the density function is identical
in shape to the detection function, only rescaled so that it integrates to unity (Buckland et al., 1993).
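The relationship between the detection function and the density estimator can be illustrated with a short numerical sketch. It is not the authors' code: it assumes that f(0) is obtained as the reciprocal of the integral of g(y) over the truncation width w (the rescaling described above), and the function names, the midpoint integration, and the example counts are hypothetical.

```python
import math

def effective_half_width(g, w, steps=10_000):
    """Integrate g(y) from 0 to w with the midpoint rule."""
    dy = w / steps
    return sum(g((i + 0.5) * dy) for i in range(steps)) * dy

def density_estimate(n, total_length, g, w):
    """D = n * f(0) / (2L), taking f(0) = 1 / integral of g over [0, w]."""
    f0 = 1.0 / effective_half_width(g, w)
    return n * f0 / (2.0 * total_length)

# Hypothetical example: perfect detection out to w = 25, which reduces to a
# strip transect, with 100 detections along 10 units of line.
def g_uniform(y):
    return 1.0

# Hypothetical half-normal detection function with scale 10 (an assumed form).
def g_half_normal(y):
    return math.exp(-y * y / (2.0 * 10.0 ** 2))

d_strip = density_estimate(100, 10.0, g_uniform, 25.0)
d_half_normal = density_estimate(100, 10.0, g_half_normal, 25.0)
```

With g(y) = 1 the estimate collapses to n/(2Lw), the strip transect count per unit area. Any detection function that falls away from the line gives a larger estimate for the same n, because undetected animals are allowed for.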
2. PROGRAM DISTANCE
The recent development of automated methods for estimating density from line transect data
includes a number of hypothetical models for estimating the shape of the detection function
(Laake et al., 1994). Buckland (1992) developed a procedure which employs the best available
parametric function, a(y), called the key function, to give a first approximation to the density and
then improves the fit using polynomial adjustments, called series expansions. If the key function is
a good fit, it needs no adjustment; the worse the fit, the greater the adjustments required.
The program DISTANCE by Laake et al. (1994) provides four arbitrary key functions
(uniform, half-normal, hazard-rate, and negative exponential) which are adjusted by the flexible
form of series expansions using one or more parameters to improve the fit of the model to the
distance data. Three series expansions are included and considered by Buckland et al. (1993):
(1) the cosine series, (2) simple polynomials, and (3) Hermite polynomials. All three expansions
are linear in their parameters. Program DISTANCE allows any key function to be used with any
series expansion. The algebraic forms of the various recommended models are listed in Buckland
et al. (1993).
Model criteria and selection are described by Burnham et al. (1980) and Buckland et al. (1993).
The choice of model is governed by three successive steps:
(1) The addition of adjustment terms to a given key function is judged by likelihood ratio tests
which indicate whether an extra term can significantly improve the fit of the model of g(y)
to the data.
(2) Program DISTANCE implements Akaike's Information Criterion (AIC) which provides a
quantitative method for model selection, whether or not models are hierarchical. Model
selection is treated within an optimisation rather than a hypothesis testing framework.
Burnham and Anderson (1992) illustrated the application of AIC, and Akaike (1973)
presented the theory underlying the method.
(3) Goodness of fit provides a warning of a problem in the data or the selected model
assumption structure, which should be investigated through closer examination of the data
or by exploring other models and fitting options. Although Buckland et al. (1993) prefer
AIC for model selection, an unusually small goodness of fit statistic is a useful warning that
the model fit might be poor, or that an assumption might be seriously violated.
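The AIC step can be sketched for two simple candidate models. This is an illustration only, not the procedure inside program DISTANCE: it assumes untruncated half-normal and negative exponential densities for the perpendicular distances, uses their closed-form maximum likelihood estimates, and ignores series adjustment terms. The distances are hypothetical.

```python
import math

def halfnormal_loglik(distances):
    """Maximised log-likelihood for f(y) = sqrt(2/(pi*s2)) * exp(-y^2/(2*s2))."""
    n = len(distances)
    s2 = sum(y * y for y in distances) / n       # MLE of sigma^2
    return n * 0.5 * math.log(2.0 / (math.pi * s2)) - n / 2.0

def negexp_loglik(distances):
    """Maximised log-likelihood for f(y) = lam * exp(-lam * y)."""
    n = len(distances)
    lam = n / sum(distances)                     # MLE of the rate
    return n * math.log(lam) - n

def aic(loglik, n_params):
    """Akaike's Information Criterion: 2p - 2 ln L."""
    return 2.0 * n_params - 2.0 * loglik

# Hypothetical perpendicular distances from one pooled sample.
distances = [0.8, 2.5, 4.1, 1.2, 7.9, 3.3, 0.4, 5.6, 9.8, 2.0]
scores = {
    "half-normal": aic(halfnormal_loglik(distances), 1),
    "negative exponential": aic(negexp_loglik(distances), 1),
}
best_model = min(scores, key=scores.get)         # smallest AIC is preferred
```

Because both candidates here have one parameter, the comparison reduces to the maximised likelihoods; the penalty term matters once adjustment terms add parameters.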
Two independent sources of error are associated with the estimates of density from program
DISTANCE. The first is concerned with estimating density from within a single primary sampling
unit, PSU. Rather than a total count, population density is estimated from a sample of detections
within a PSU. Total density will be biased if estimates of PSU densities are consistently
inaccurate. Such a situation arises if the detection function is unrealistic or if it is unable to
provide adequate t to the true detection process (Buckland 1985).
It is often assumed that a PSU describes an area of homogeneous terrain, habitat, and species
density. Unfortunately nature is very rarely so uniform and, unless sampling occurs in an orchard,
is inherently patchy and heterogeneous across even very local scales. Models are therefore required
to be pooling robust (Burnham et al., 1980) so that reliable estimates of density can be obtained
from data pooled over the many factors, external and internal to the population, affecting
detection.
The second source of error comes from the precision of the density estimate for the entire
population from the sample of PSU estimates. If it is assumed that there is no variation in
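encounter rate among complete transect lines within a sample, then
\[
E\left(\frac{n_i}{l_i}\right) = d \tag{1}
\]
which is constant for all lines. When equation (1) is true, the appropriate estimator of $d$ is
$\hat{d} = n/L$, and
\[
\widehat{\operatorname{var}}(n) = \frac{L \sum_{i=1}^{k} l_i \left( \dfrac{n_i}{l_i} - \dfrac{n}{L} \right)^2}{k - 1} \tag{2}
\]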
Buckland et al. (1993) state that the critical point is that if equation (1) fails to the extent that
there is substantial variation in per line encounter rate, with
\[
d_i = E\left(\frac{n_i}{l_i}\right),
\]
then $\widehat{\operatorname{var}}(n)$ from equation (2) is not appropriate as it includes both stochastic
(residual) variation and the structural variation among $d_1, \ldots, d_k$. They further state that this
latter variation does not belong in $\widehat{\operatorname{var}}(n)$, as it represents large-scale variation in
true object density over the study area.
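The encounter-rate variance of equation (2), var(n) = L * sum_i l_i (n_i/l_i - n/L)^2 / (k - 1), can be sketched as a small function. The function name and the example counts are hypothetical; the second example deliberately uses strongly unequal per-line encounter rates to show how structural variation among lines inflates the estimate.

```python
def encounter_rate_variance(counts, lengths):
    """Equation (2): var(n) = L * sum_i l_i (n_i/l_i - n/L)^2 / (k - 1)."""
    k = len(counts)
    if k < 2 or k != len(lengths):
        raise ValueError("need at least two lines, with one length per count")
    n = sum(counts)
    total_length = sum(lengths)
    spread = sum(l * (c / l - n / total_length) ** 2
                 for c, l in zip(counts, lengths))
    return total_length * spread / (k - 1)

# Equal encounter rate on every line (10 detections per unit length).
var_equal = encounter_rate_variance([10, 20, 30], [1.0, 2.0, 3.0])

# Hypothetical counts with large between-line differences, unit line lengths.
var_unequal = encounter_rate_variance([50, 100, 200, 400, 800], [1.0] * 5)
```

When every line has the same encounter rate the estimate is zero; when the rates differ between lines, the whole of that difference enters var(n), which is the situation Buckland et al. (1993) caution against.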
3. BASIS FOR STUDY
The aim of this study is to assess the ability of current methods for analysing distance sampling
data (namely program DISTANCE), to produce unbiased estimates of density in spite of both
potential sources of error. A number of researchers have examined the accuracy of estimators
when the assumption of uniform distribution of distances from objects to the transect is violated
(Laake 1978; Smith 1979; Turnock and Quinn 1991). If the animals sampled are animals that
move in response to the observer before being detected, then the distribution will be non-uniform
and the estimates of density can be accordingly biased.
If the probability of detecting animals on or near the transect line is consistently less than 1,
estimates of density will likewise be consistently biased (underestimates of density). Researchers
have examined estimators for left-truncated data sets (Alldredge and Gates 1985) and unimodal
detection functions for the cases where visibility is restricted close to the line but perfect detection
exists parallel to the transect (Quang and Lanctot 1991). If maximum detection can be estimated
(Pollock and Kendall 1987), then g(0) can be rescaled accordingly. Methods have been proposed
which employ reliable ancillary data from a second observer such that they are robust to directional movement of animals and do not require certain detection on the centre-line (Buckland
and Turnock 1992).
The investigation of data sets which do not explicitly fail the assumptions of uniform distribution
or certain detection on the line will provide an understanding of how reliable DISTANCE is for
the forms of data the program is expected to analyse.
4. SIMULATION MODELS
The development of realistic models for simulating the detection process is discussed in detail
by Buckland (1985) and Buckland et al. (1993). Throughout this study five models are presented
as plausible functions for the detection process. It is acknowledged that nature is unlikely to ever
be as simple as the hypothesised functions developed here. All of the detection functions satisfy
the assumption of g(0) = 1 and, apart from the uniform model (g(y) = 1), are monotonically
decreasing with increasing distance. All of the functions are truncated at y = 25. The quarter
circle and triangle have g(25) = 0, whereas the half-gaussian and the negative exponential models
have an extra parameter controlling their shape. Alternative forms of all five models are displayed
in Figure 1, and their proportions of detection in Figure 2.

Figure 1. Plausible functions for simulating the detection process: (a) detection function all; uniform, 1/4 circle, 1/2
gaussian, triangle and negative exponential; (b) 1/2 gaussian, alternative form model, for five different values of the shape
parameter t; and (c) negative exponential, alternative form model, for five different values of the shape parameter t

Figure 2. The effectiveness (average number detected/sample abundance) for each of the simulated processes of detection,
at detecting animals within a PSU

Animals are uniformly distributed within a PSU between y = 0 and y = 25 from the transect
line. A population was sampled with a total of five PSUs. Each animal was allocated a probability
of detection based on the form of the simulated detection function of that particular PSU.
Animals were detected through the sampling of their probabilities from a random distribution
between 0 and 1. Distance from the line for each detection was stored in a DISTANCE data set.
Detection functions and distances were simulated using the SAS interactive matrix
language (SAS/IML; SAS Institute 1994) and estimates of density and variance obtained through
program DISTANCE. Computer programs for the implementation of detections and the harvesting
of DISTANCE output are available from the principal author. Two studies were conducted and their
details are given below and in Table I.
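The simulation of detections within one PSU might be sketched as follows. This is not the authors' SAS/IML program. The algebraic forms of the five detection functions are assumptions chosen only to satisfy the stated properties (g(0) = 1, truncation at y = 25, g(25) = 0 for the quarter circle and triangle), and an animal is assumed to be detected when a uniform random number on [0, 1) falls below its detection probability.

```python
import math
import random

W = 25.0  # truncation distance stated in the text

# Assumed algebraic forms; t is the shape parameter where one exists.
def g_uniform(y, t=None):
    return 1.0

def g_quarter_circle(y, t=None):
    return math.sqrt(max(0.0, 1.0 - (y / W) ** 2))

def g_triangle(y, t=None):
    return 1.0 - y / W

def g_half_gaussian(y, t=10.0):
    return math.exp(-y * y / (2.0 * t * t))

def g_neg_exponential(y, t=11.0):
    return math.exp(-y / t)

def simulate_psu(abundance, g, rng, **shape):
    """Return the perpendicular distances of animals detected in one PSU."""
    detected = []
    for _ in range(abundance):
        y = rng.uniform(0.0, W)            # animal placed uniformly in [0, W]
        if rng.random() < g(y, **shape):   # detected with probability g(y)
            detected.append(y)
    return detected

rng = random.Random(1)
all_seen = simulate_psu(200, g_uniform, rng)
triangle_seen = simulate_psu(200, g_triangle, rng)
narrow_seen = simulate_psu(200, g_half_gaussian, rng, t=5.0)
```

The list of distances returned for each PSU corresponds to the data that would then be passed to program DISTANCE; changing the function or its shape parameter between PSUs mimics the `alternative form' cases.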
5. STUDY DESCRIPTIONS
The first study can be imagined as sampling from a whole population such that the five transect
lines sampled make up the entire population of possible PSUs. This design could apply to a small
nature reserve where researchers counting, e.g., Lepidoptera search the entire area by traversing a
number of equal strips. This study examines the situation where each of the five transect lines
sampled has a unique abundance and/or detection function. Such a situation occurs during the
sampling of a variety of heterogeneous habitats where researchers are forced to sample in patches
of varying density and detection.
By changing the shape of the detection process (i.e. uniform to negative exponential) the
differences common between habitats, sampling times and observers can be simulated. The focus
is the individual transect and how easily program DISTANCE is `fooled' by changes in the encounter
rate and detection functions for transects within a sample.

Table I. Details of the abundance and detection function shapes, for each scenario, in two studies of the
sources of potential error in the analysis of simulated distance sampling data. Five PSUs were sampled by
single transects in each separate scenario

Study one
Density  Function                                  Replication  Primary sampling unit abundance
High     All                                       50           200
         Uniform                                   50           200
         1/4 circle                                50           200
         1/2 gaussian (t = 10)                     50           200
         Triangle                                  50           200
         Negative exponential (t = 11)             50           200
         Uniform                                   50           50–100–200–400–800
         1/4 circle                                50           50–100–200–400–800
         1/2 gaussian (t = 10)                     50           50–100–200–400–800
         Triangle                                  50           50–100–200–400–800
         Negative exponential                      50           50–100–200–400–800
         1/2 gaussian (Alternative form)           50           50–100–200–400–800
         Negative exponential (Alternative form)   50           50–100–200–400–800
         1/2 gaussian (Alternative form)           50           200
         Negative exponential (Alternative form)   50           200
         All                                       50           nexpon (800)–unif (50)
         All                                       50           unif (800)–nexpon (50)
Low      All                                       50           40
         Uniform                                   50           40
         1/4 circle                                50           40
         1/2 gaussian (t = 10)                     50           40
         Triangle                                  50           40
         Negative exponential                      50           40
         Uniform                                   50           10–20–40–80–120
         1/4 circle                                50           10–20–40–80–120
         1/2 gaussian (t = 10)                     50           10–20–40–80–120
         Triangle                                  50           10–20–40–80–120
         Negative exponential                      50           10–20–40–80–120
         1/2 gaussian (Alternative form)           50           10–20–40–80–120
         Negative exponential (Alternative form)   50           10–20–40–80–120
         1/2 gaussian (Alternative form)           50           40
         Negative exponential (Alternative form)   50           40
         All                                       50           nexpon (120)–unif (10)
         All                                       50           unif (120)–nexpon (10)

Study two
Distribution  Function              Replication  Primary sampling unit abundance
Log normal    Uniform               500          avg (310) stddev (304.96)
              Negative exponential  500          avg (310) stddev (304.96)
              All                   500          avg (310) stddev (304.96)

Program DISTANCE allows the user to estimate detection functions from either the pooled data
set (i.e. one estimate of f(0) for the five transect lines: Detection by All) or for each individual
transect (i.e. five separate estimates of f(0): Detection by Sample). Since adequate density
estimation requires a relatively large number of detections (e.g. 60–80; Buckland et al., 1993),
most researchers studying biological populations will never encounter densities such that
detection functions can be estimated from individual lines. In both cases however, estimates of $D_i$
are kept separate and the variance estimate of pooled density is calculated from individual
transect lines. For line transects with replicate line $i$ of length $l_i$,
\[
\hat{D} = \frac{\sum_{i=1}^{k} l_i \hat{D}_i}{L}.
\]
Seventeen cases were examined in study one (Table I). For function `all', each PSU had a
separate detection function (Figure 1(a)), in contrast to each PSU having the same detection
function with, or without, the same PSU abundance. The alternative form case examines the two
detection functions (half-gaussian and negative exponential) which can change shape between
the five PSUs sampled (Figure 1(b,c)).
Study one is split into two scenarios: `high density' and `low density'. In the first scenario,
population sizes are large enough for program DISTANCE to adequately estimate detection
functions for individual PSUs (e.g. 200 animals per PSU). In the second scenario, sample sizes
are reduced to a considerably more realistic level such that they are suitable for pooled analysis
but cause DISTANCE to output ``WARNING: number of observations is small. Do not expect
reasonable results'' when estimating density and its components from individual transect lines
(e.g. 40 animals per PSU).
The second study examines the situation where the five transect lines sampled make up a
random sample of all possible PSUs. The simulation samples transects from a large area where
the population is patchily distributed so that densities in each transect are from some stochastic
distribution with mean μ and standard deviation σ.
Although individual PSUs are expected to have highly variable encounter rates they are all
sampled from the one distribution. Most sampling of wildlife populations is expected to occur in
this manner, e.g. marine sampling of cetaceans, aerial surveys of large mammals, or bird counts
through extensive wildland regions.
For this study PSUs are sampled from a log-normal distribution with mean 310 and a standard
deviation of 304.959. The mean and standard deviation were chosen from the high density
scenario in study one based on PSU abundances of 50–800 (see Table I). Means and errors of
density estimates are calculated from a sample of 500 replicates. Three scenarios were
investigated; the first two had the same detection function per PSU (uniform and negative
exponential: best and worst case scenarios) and the third had a separate detection function for each
PSU.
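The mean of 310 and standard deviation of 304.959 follow directly from the five high density abundances (50, 100, 200, 400, 800). The sketch below reproduces that arithmetic and then draws log-normal abundances with the same mean and standard deviation. The moment-matching conversion to log-scale parameters is a standard identity, but its use here is an assumption; the source does not say how the log-normal was parameterised.

```python
import math
import random
import statistics

abundances = [50, 100, 200, 400, 800]
mean = statistics.mean(abundances)     # 310
sd = statistics.stdev(abundances)      # sample SD (divisor k - 1)

# Moment matching for a log-normal with log-scale parameters (mu, sigma):
# mean = exp(mu + sigma^2 / 2), variance = (exp(sigma^2) - 1) * mean^2.
sigma2 = math.log(1.0 + (sd / mean) ** 2)
mu = math.log(mean) - sigma2 / 2.0

# One simulated replicate: abundances for five PSUs (seed is arbitrary).
rng = random.Random(0)
psu_abundances = [rng.lognormvariate(mu, math.sqrt(sigma2)) for _ in range(5)]
```

Draws from this distribution are continuous, so a simulation would still need to round them to whole animals.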
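In both studies, the variance from density estimates produced by program DISTANCE:
\[
\widehat{\operatorname{var}}(\hat{D}) = \hat{D}^2 \left\{ \frac{\widehat{\operatorname{var}}(n)}{n^2}
+ \frac{\widehat{\operatorname{var}}[\hat{f}(0)]}{[\hat{f}(0)]^2} \right\} \tag{3}
\]
is compared with the variance from individual line densities:
\[
\widehat{\operatorname{var}}(\hat{D}) = \frac{\sum_{i=1}^{k} l_i (\hat{D}_i - \hat{D})^2}{L(k - 1)} \tag{4}
\]
and a simple stratified variance approach:
\[
\widehat{\operatorname{var}}(\hat{D}) = \frac{1}{k^2} \sum_{i=1}^{k} \widehat{\operatorname{var}}(\hat{D}_i).
\]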
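The three variance estimators compared in this study can be written out as a short sketch: the delta-method form D^2 {var(n)/n^2 + var[f(0)]/f(0)^2} of equation (3), the between-line form sum_i l_i (D_i - D)^2 / [L(k - 1)] of equation (4), and the stratified form (1/k^2) sum_i var(D_i). The function names and example inputs are hypothetical, and this is not the code used by program DISTANCE.

```python
import math

def pooled_density(line_densities, lengths):
    """Length-weighted mean of the per-line density estimates."""
    return sum(l * d for l, d in zip(lengths, line_densities)) / sum(lengths)

def var_distance(d_hat, n, var_n, f0, var_f0):
    """Equation (3): D^2 * (var(n)/n^2 + var(f0)/f0^2)."""
    return d_hat ** 2 * (var_n / n ** 2 + var_f0 / f0 ** 2)

def var_between_lines(line_densities, lengths):
    """Equation (4): sum_i l_i (D_i - D)^2 / (L (k - 1))."""
    k = len(line_densities)
    d_hat = pooled_density(line_densities, lengths)
    spread = sum(l * (d - d_hat) ** 2 for l, d in zip(lengths, line_densities))
    return spread / (sum(lengths) * (k - 1))

def var_stratified(line_variances):
    """Stratified form: (1/k^2) * sum_i var(D_i)."""
    return sum(line_variances) / len(line_variances) ** 2

# Hypothetical per-line densities equal to the high density abundances,
# on five lines of unit length.
densities = [50.0, 100.0, 200.0, 400.0, 800.0]
se_between = math.sqrt(var_between_lines(densities, [1.0] * 5))
```

With these inputs the between-line standard error is about 136.38, the value that appears as the expected SE for study two. The stratified form ignores differences between lines altogether and uses only the within-line variances.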
In the first study where the entire population has been sampled, it is predicted that if program
DISTANCE estimates variance from the variation between lines, then in situations where the
within-line variation is small but the between-line variation is large, estimators calculated from
equations (3) and (4) will overestimate the observed variance of the density estimates. The best
estimate of observed variance would be approximated by the stratified estimator which only takes
into account the variation within transect lines.
Bias in the density estimates was calculated as
|average observed estimate of density over all simulations - true population density|.
Means and errors of density estimates were calculated from a sample of 50 replicates in
study one and 500 in study two. t-tests were used to test for the existence of significant bias.
The proportion of bias in both the density estimate and its contribution to the error [i.e.
error = bias² + var(density estimate)] were calculated. A 5 per cent level is used for detecting a
significant bias. It is noted that even without bias there are likely to be a number of significant
values at the 5 per cent level when such a large number of t-tests are used.
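The bias statistics described above can be sketched for a set of replicate density estimates. The function name and the replicate values are hypothetical. The sample variance (divisor n - 1) is an assumption, since the source does not state which divisor was used, and the t statistic shown is the ordinary one-sample form, to be compared against a t distribution with n - 1 degrees of freedom.

```python
import math

def bias_summary(estimates, true_density):
    """Summarise bias for replicate density estimates from one scenario."""
    n = len(estimates)
    mean = sum(estimates) / n
    var = sum((e - mean) ** 2 for e in estimates) / (n - 1)  # assumed divisor
    bias = abs(mean - true_density)
    error = bias ** 2 + var                 # error = bias^2 + var
    return {
        "bias": bias,
        "percent_bias": 100.0 * bias / true_density,
        "bias_percent_of_error": 100.0 * bias ** 2 / error,
        "t": (mean - true_density) / math.sqrt(var / n),
    }

# Hypothetical replicate estimates for a true density of 9.
summary = bias_summary([9.0, 11.0, 10.0, 12.0, 8.0], 9.0)
```

A scenario can show a small percentage bias in the density estimate and still have bias account for a large share of the error when the replicate estimates vary very little.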
Density estimates and levels of precision are compared for (1) Detection by Sample, where
DISTANCE fits a separate detection function for each transect; and (2) Detection by All, where
DISTANCE fits a single detection function for the pooled sample of PSUs.
6. RESULTS
6.1. Study one: Estimator bias
6.1.1. Detection by Sample
Bias was significant in 12 of the 17 cases for `high density' and six of the 17 cases for `low density'.
The percentage bias in the density estimates was greater than 5 per cent in only three cases for
`high density' and seven (one non-significant) for `low density' (Table II). In both scenarios the
negative exponential where the shape parameter was varied (alternative form negative
exponential) had the greatest percentage of bias in the estimate of density (6.8 per cent and 14.7 per
cent respectively). The percentage of bias in the error of the density estimate was less than 7.6 per
cent in all cases where bias was non-significant and ranged from 8.4 per cent to 54.1 per cent when
bias was significant.
6.1.2. Detection by All
The percentage of bias in the estimates of density was greater than 5 per cent in only two cases
for `high density' and seven cases for `low density' (Table III). The greatest percentage of bias in
the estimate of density was 9.5 per cent from 1/2 gaussian (avg = 54) in `low density'.
Bias was significant in eight cases for `high density' and 11 cases for `low density'. The
`alternative form negative exponential' had the greatest percentage of bias in `high density'
(9.3 per cent) and was second highest in `low density' (8.2 per cent). The percentage of bias in the
error of density was less than 3.5 per cent in all cases where bias was non-significant and ranged
from 7.8 per cent to 43.8 per cent when bias was significant.

Table II. Results for study one (Detection by Sample) bias trials testing for the existence of significant bias.
Related statistics include the size of bias, and its percent influence in the estimate of density and error.
Significance of bias was tested at the 5 per cent level. AF = alternative form

Density  Function         PSU        Average   SD of     Bias    Percent   Bias as %
                          abundance  density   density           bias of   of error
                                     estimate  estimate          density
                                                                 estimate
High     All              200        200.217   16.804    0.22    0.109     0.017
         Uniform          200        204.883   4.965     4.88    2.441     49.678
         1/4 circle       200        209.823   10.173    9.82    4.912     48.756
         1/2 gauss        200        210.822   17.045    10.82   5.411     29.144
         Triangle         200        198.543   18.318    1.46    0.729     0.642
         Neg-expon        200        190.470   23.160    9.53    4.765     14.733
         Uniform          Avg 310    314.929   6.372     4.93    1.590     37.911
         1/4 circle       Avg 310    321.638   19.844    11.64   3.754     25.980
         1/2 gauss        Avg 310    322.661   23.758    12.66   4.084     22.467
         Triangle         Avg 310    312.940   35.901    2.94    0.949     0.680
         Neg-expon        Avg 310    302.545   26.854    7.46    2.405     7.291
         1/2 gauss (AF)   Avg 310    326.808   27.605    16.81   5.422     27.446
         Neg-expon (AF)   Avg 310    301.118   29.711    8.88    2.865     8.357
         1/2 gauss (AF)   200        208.875   16.652    8.87    4.437     22.470
         Neg-expon (AF)   200        186.490   20.346    13.51   6.755     31.034
         All              Avg 310    315.731   25.941    5.73    1.849     4.745
         All              Avg 310    314.998   13.425    5.00    1.612     12.389
Low      All              40         39.365    6.315     0.64    1.589     1.023
         Uniform          40         42.167    2.014     2.17    5.417     54.145
         1/4 circle       40         39.432    3.894     0.57    1.421     2.128
         1/2 gauss        40         40.726    6.828     0.73    1.816     1.141
         Triangle         40         38.676    8.731     1.32    3.310     2.293
         Neg-expon        40         36.734    11.519    3.27    8.165     7.582
         Uniform          Avg 54     56.016    2.509     2.02    3.733     39.707
         1/4 circle       Avg 54     57.171    6.074     3.17    5.873     21.763
         1/2 gauss        Avg 54     60.419    11.032    6.42    11.887    25.677
         Triangle         Avg 54     53.459    7.801     0.54    1.002     0.489
         Neg-expon        Avg 54     51.195    12.855    2.80    5.194     4.632
         1/2 gauss (AF)   Avg 54     57.237    9.616     3.24    5.995     10.364
         Neg-expon (AF)   Avg 54     51.590    15.598    2.41    4.463     2.378
         1/2 gauss (AF)   40         41.682    6.513     1.68    4.205     6.373
         Neg-expon (AF)   40         34.113    10.620    5.89    14.718    23.872
         All              Avg 54     53.225    8.278     0.77    1.434     0.885
         All              Avg 54     55.370    5.206     1.37    2.536     6.597

Table III. Results for study one (Detection by All) bias trials testing for the existence of significant bias.
Related statistics include the size of bias, and its percent influence in the estimate of density and error.
Significance of bias was tested at the 5 per cent level. AF = alternative form

Density  Function         PSU        Average   SD of     Bias    Per cent  Bias as per
                          abundance  density   density           bias of   cent of
                                     estimate  estimate          density   error
                                                                 estimate
High     All              200        197.566   22.319    2.43    1.217     1.199
         Uniform          200        201.506   4.672     1.51    0.753     9.590
         1/4 circle       200        207.026   19.055    7.03    3.513     12.184
         1/2 gauss        200        201.361   11.873    1.36    0.680     1.323
         Triangle         200        196.635   32.474    3.36    1.682     1.084
         Neg-expon        200        190.943   24.695    9.06    4.529     12.069
         Uniform          Avg 310    311.376   4.788     1.38    0.444     7.772
         1/4 circle       Avg 310    318.607   23.345    8.61    2.776     12.181
         1/2 gauss        Avg 310    315.432   17.690    5.43    1.752     8.777
         Triangle         Avg 310    304.257   43.926    5.74    1.853     1.714
         Neg-expon        Avg 310    304.667   29.741    5.33    1.720     3.176
         1/2 gauss (AF)   Avg 310    319.721   35.037    9.72    3.136     7.283
         Neg-expon (AF)   Avg 310    280.458   35.599    29.54   9.530     41.271
         1/2 gauss (AF)   200        206.807   25.422    6.81    3.404     6.818
         Neg-expon (AF)   200        181.439   21.218    18.56   9.281     43.847
         All              Avg 310    317.746   41.345    7.75    2.499     3.457
         All              Avg 310    314.391   22.300    4.39    1.417     3.806
Low      All              40         39.556    6.428     0.44    1.109     0.484
         Uniform          40         40.956    2.293     0.96    2.389     15.056
         1/4 circle       40         41.637    5.505     1.64    4.094     8.281
         1/2 gauss        40         42.311    6.816     2.31    5.778     10.501
         Triangle         40         40.660    8.650     0.66    1.650     0.591
         Neg-expon        40         38.597    9.295     1.40    3.508     2.273
         Uniform          Avg 54     54.841    2.823     0.84    1.558     8.314
         1/4 circle       Avg 54     58.154    8.532     4.15    7.693     19.475
         1/2 gauss        Avg 54     59.154    10.562    5.15    9.545     19.552
         Triangle         Avg 54     53.795    10.342    0.20    0.379     0.040
         Neg-expon        Avg 54     53.819    13.693    0.18    0.335     0.018
         1/2 gauss (AF)   Avg 54     58.045    10.018    4.04    7.490     14.263
         Neg-expon (AF)   Avg 54     50.179    11.295    3.82    7.076     10.457
         1/2 gauss (AF)   40         42.416    6.812     2.42    6.040     11.376
         Neg-expon (AF)   40         36.716    9.271     3.28    8.211     11.354
         All              Avg 54     52.639    9.438     1.36    2.520     2.078
         All              Avg 54     56.672    5.585     2.67    4.947     18.927

6.2. Study one: Estimator precision
6.2.1. Detection by Sample
All three methods for estimating variance were reasonable approximations when PSU
abundances were equal, regardless of whether or not detection functions were similar between
lines (Table IV). When PSU abundances were unequal, the observed standard deviation of the
density estimates was only approximated by the stratified estimate of standard error and greatly
overestimated by the other two methods. In 29 out of the 34 cases the stratified estimate had a
lower standard error than the observed standard deviation of density estimates, particularly for
larger values.

Table IV. Results for study one (Detection by Sample) precision trials showing the variance estimates from
four different sources. The variation of the observed density is described by the standard deviation of the
estimates of density and compared to estimates from a stratified approach, program DISTANCE, and an
estimator of variation between lines. AF = alternative form

Density  Function         PSU        Average   SD of     Average    DISTANCE    Between-line
                          abundance  density   density   stratified average SE  average SE
                                     estimate  estimate  SE of      of estimate of estimate
                                                         estimate
High     All              200        200.217   16.804    12.403     14.143      14.240
         Uniform          200        204.883   4.965     6.996      3.520       3.608
         1/4 circle       200        209.823   10.173    10.311     10.168      10.285
         1/2 gauss        200        210.822   17.045    12.131     14.820      14.925
         Triangle         200        198.543   18.318    13.272     15.651      15.744
         Neg-expon        200        190.470   23.160    18.597     18.697      18.784
         Uniform          Avg 310    314.929   6.372     8.712      138.192     138.329
         1/4 circle       Avg 310    321.638   19.844    13.233     143.179     143.342
         1/2 gauss        Avg 310    322.661   23.758    14.716     140.297     140.443
         Triangle         Avg 310    312.940   35.901    17.573     142.005     142.160
         Neg-expon        Avg 310    302.545   26.854    22.278     136.620     136.752
         1/2 gauss (AF)   Avg 310    326.808   27.605    19.597     144.055     144.199
         Neg-expon (AF)   Avg 310    301.118   29.711    28.283     136.492     136.633
         1/2 gauss (AF)   200        208.875   16.652    13.610     14.953      15.052
         Neg-expon (AF)   200        186.490   20.346    18.847     19.910      19.996
         All              Avg 310    315.731   25.941    20.192     137.416     137.581
         All              Avg 310    314.998   13.425    11.896     139.350     139.494
Low      All              40         39.365    6.315     5.364      5.880       5.897
         Uniform          40         42.167    2.014     3.177      1.247       1.266
         1/4 circle       40         39.432    3.894     3.783      3.646       3.665
         1/2 gauss        40         40.726    6.828     5.433      6.106       6.125
         Triangle         40         38.676    8.731     6.427      6.904       6.923
         Neg-expon        40         36.734    11.519    8.572      9.721       9.736
         Uniform          Avg 54     56.016    2.509     3.661      20.760      20.789
         1/4 circle       Avg 54     57.171    6.074     5.387      22.584      22.609
         1/2 gauss        Avg 54     60.419    11.032    8.077      24.869      24.895
         Triangle         Avg 54     53.459    7.801     6.694      22.057      22.082
         Neg-expon        Avg 54     51.195    12.855    9.394      22.813      22.834
         1/2 gauss (AF)   Avg 54     57.237    9.616     8.568      24.086      24.114
         Neg-expon (AF)   Avg 54     51.590    15.598    13.466     23.505      23.530
         1/2 gauss (AF)   40         41.682    6.513     5.794      6.382       6.398
         Neg-expon (AF)   40         34.113    10.620    7.850      8.266       8.281
         All              Avg 54     53.225    8.278     7.456      20.490      20.513
         All              Avg 54     55.370    5.206     4.935      22.579      22.609

6.2.2. Detection by All

The observed standard deviation for Detection by All was not well approximated by any method
(Table V). In 30 out of the 34 cases the stratified estimate underestimated the observed standard
deviation of density estimates.
In situations where there was large-scale variation in animal density between PSUs, program
DISTANCE considerably overestimated the observed standard deviation of the density estimates.

Table V. Results for study one (Detection by All) precision trials showing the variance estimates from four
different sources. The variation of the observed density is described by the standard deviation of the
estimates of density and compared to estimates from a stratified approach, program DISTANCE, and an
estimator of the variation between lines. AF = alternative form

Density  Function         PSU        Average   SD of     Average    DISTANCE    Between-line
                          abundance  density   density   stratified average SE  average SE
                                     estimate  estimate  SE of      of estimate of estimate
                                                         estimate
High     All              200        197.566   22.319    8.853      35.839      34.743
         Uniform          200        201.506   4.672     6.409      0.788       0.000
         1/4 circle       200        207.026   19.055    8.250      8.712       3.316
         1/2 gauss        200        201.361   11.873    9.074      8.932       5.157
         Triangle         200        196.635   32.474    9.731      10.989      5.904
         Neg-expon        200        190.943   24.695    11.498     14.487      6.333
         Uniform          Avg 310    311.376   4.788     7.976      136.727     136.987
         1/4 circle       Avg 310    318.607   23.345    11.213     140.797     140.453
         1/2 gauss        Avg 310    315.432   17.690    11.997     138.523     138.307
         Triangle         Avg 310    304.257   43.926    13.445     134.224     133.641
         Neg-expon        Avg 310    304.667   29.741    16.383     134.450     133.409
         1/2 gauss (AF)   Avg 310    319.721   35.037    15.642     76.088      74.135
         Neg-expon (AF)   Avg 310    280.458   35.599    17.871     69.688      65.820
         1/2 gauss (AF)   200        206.807   25.422    10.084     43.530      42.181
         Neg-expon (AF)   200        181.439   21.218    11.753     39.507      36.357
         All              Avg 310    317.746   41.345    14.723     97.372      95.895
         All              Avg 310    314.391   22.300    11.163     175.099     174.899
Low      All              40         39.556    6.428     3.922      7.917       6.947
         Uniform          40         40.956    2.293     2.944      0.415       0.000
         1/4 circle       40         41.637    5.505     3.562      3.178       1.416
         1/2 gauss        40         42.311    6.816     4.274      4.333       2.560
         Triangle         40         40.660    8.650     4.441      4.943       2.613
         Neg-expon        40         38.597    9.295     5.204      6.830       3.449
         Uniform          Avg 54     54.841    2.823     3.393      20.716      20.714
         1/4 circle       Avg 54     58.154    8.532     4.584      22.178      21.795
         1/2 gauss        Avg 54     59.154    10.562    5.259      22.765      22.405
         Triangle         Avg 54     53.795    10.342    5.353      20.886      20.271
         Neg-expon        Avg 54     53.819    13.693    7.024      22.246      20.630
         1/2 gauss (AF)   Avg 54     58.045    10.018    6.453      13.508      11.748
         Neg-expon (AF)   Avg 54     50.179    11.295    7.470      14.893      11.208
         1/2 gauss (AF)   40         42.416    6.812     4.536      10.043      8.879
         Neg-expon (AF)   40         36.716    9.271     5.210      10.221      7.762
         All              Avg 54     52.639    9.438     5.225      14.526      13.633
         All              Avg 54     56.672    5.585     4.811      28.453      28.080

All overestimates of the density variance occurred where the true abundance varied between
PSUs (i.e. `high density' avg 310 and `low density' avg 54). In one situation, Detection by All
(function all; avg 310 and 54), the estimate of variance is smaller than expected due to an
`apparent' reduced variability between PSUs, where the function which saw the most (uniform)
was in the PSU of the lowest density.
Table VI. Results for study two bias trials, testing for the existence of significant bias, and variance trials.
Related statistics include the size of bias, and its percent influence in the estimate of density and error. The
variation of the observed differences is described by the standard deviation of the estimates of density and
compared to estimates from program DISTANCE

Detection  Function    PSU        Expected  Average   SD       Percent   Bias as %  DISTANCE
method                 abundance  SE        density            bias of   of error   average SE
                                            estimate           density              of estimate
                                                               estimate
Sample     Uniform     Avg 310    136.38    314.925   127.359  1.589     0.150      104.831
           Neg-expon   Avg 310    136.38    296.554   131.872  4.337     1.031      104.333
           All         Avg 310    136.38    319.262   133.035  2.988     0.483      110.873
All        Uniform     Avg 310    136.38    312.064   127.073  0.666     0.003      104.528
           Neg-expon   Avg 310    136.38    300.309   132.589  3.126     0.053      104.371
           All         Avg 310    136.38    314.728   132.505  1.525     0.013      116.211
6.3. Study two Estimator bias

In study two, bias was non-significant for two of the three cases in Detection by All and for all
three cases in Detection by Sample (Table VI). The proportion of bias in the estimates for
density in both cases was less than 0.05 and in the error of density less than 0.02.
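
The two bias statistics quoted above can be illustrated with a short sketch. It assumes that percent bias is the absolute difference between the average estimate and the true density, relative to the true density, and that the share of bias in the error is the squared bias divided by the sum of squared bias and variance. Both definitions, the function names, and the use of 310 as the true density are assumptions made for illustration; with the Detection by Sample values from Table VI they land close to the tabulated entries.

```python
def percent_bias(mean_estimate, true_density):
    """Absolute bias of the average estimate, as a percentage of the true density."""
    return 100.0 * abs(mean_estimate - true_density) / true_density


def bias_share_of_error(mean_estimate, true_density, sd_of_estimates):
    """Squared bias as a percentage of (squared bias + variance).

    This decomposition is an assumed definition, not one stated in the text.
    """
    bias_squared = (mean_estimate - true_density) ** 2
    return 100.0 * bias_squared / (bias_squared + sd_of_estimates ** 2)


TRUE_DENSITY = 310.0  # assumed: the average PSU abundance used in study two

# (average density estimate, SD of estimates) for Detection by Sample, Table VI
sample_rows = {
    "Uniform": (314.925, 127.359),
    "Neg-expon": (296.554, 131.872),
    "All": (319.262, 133.035),
}

for function, (estimate, sd) in sample_rows.items():
    print(
        function,
        round(percent_bias(estimate, TRUE_DENSITY), 3),
        round(bias_share_of_error(estimate, TRUE_DENSITY, sd), 3),
    )
```

Because the variance term dominates the denominator, even a bias of several per cent in the estimate accounts for only a small share of the total error.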
6.4. Study two Estimator precision
The observed standard deviation of density estimates closely approximated the true standard
error of the sampling distribution (Table VI). Although estimates of standard error from program DISTANCE were reasonable approximations to the observed standard deviation of estimated
densities, they were underestimates in all six cases.
7. DISCUSSION
If the study is seen to examine the situation where all possible PSUs are sampled, the only source
of uncertainty will come from estimating density based on the adequate fit of detection functions
to the true detection process. The proportion of bias in the estimates of density was never greater
than 0.15, even though bias was significant in over half the cases. The majority of significant bias
is due to a higher proportion of bias in the error of each of the biased estimates of density. The
uniform detection function, for example, has a certain probability of detecting animals (i.e. every
animal is detected) yet in each case it was significantly biased. In both `high density' and `low
density', uniform (200 and 40) had the highest proportion of bias in the total error of the estimate
of density for Detection by Sample. This is because there is very little variation in the estimate of
density, and bias consequently forms a much greater percentage of the total error. When program
DISTANCE is consistent in its choice of detection function and subsequently its estimate of density,
the variance of estimates is low and the proportion of bias in the error greater, as is the
significance of the bias.
The result that with a relatively small sample size (i.e. possibly as few as 10 in a PSU) the estimate
of density never contained as much as 15 per cent bias is particularly reassuring for researchers
collecting line transect data. Even in situations where each transect had a different detection
function, and PSUs varied in abundance over an order of magnitude, the results for the
percentage of bias within the estimate of density were less than 5 per cent. This was true even
when the function which saw the least (negative exponential) was in the PSU of the highest
density, and vice versa.
It is noted that the negative exponential model which had the greatest proportion of bias in the
estimate of density has been previously questioned as to its suitability as a model for the process
of detection (Burnham et al., 1980). Results show (Table II) that even when the true detection
process and function are negative exponential, the model underestimated density every time and
was negatively biased in over half the instances. Indeed it is explicitly stated that the negative
exponential model is included within DISTANCE purely for the salvaging of extremely spiked data
(Buckland et al., 1993) and consideration should be given to the use of the hazard-rate model for
data that appear spiked.
The percentage of bias in the estimates of density from study one shows that program DISTANCE
offers a reliable method of automated detection function choice and estimation of density from
line transect distance data. The accuracy of estimates is variable, but considering the range of
situations program DISTANCE was required to estimate density for, the level of bias is consistently
minimal.
The situation where each PSU has an individual abundance and different detection function is,
in fact, highly unlikely in any reasonably sized habitat and population. If indeed sampling does
occur across such a process, stratification procedures are recommended and can be included
through program DISTANCE. The apparent underestimate of the stratified variance estimator
is largely due to its use of a Poisson error term. The Poisson estimate of within variance
is inadequate if the selected detection function is wrong. The stratified estimate of variance is
therefore unreliable as an estimate of Detection by All, where only one detection function is
chosen across a number of different PSUs.
In Detection by Sample situations where the abundance between PSUs was highly different,
only the stratified estimate of variance was able to approximate the observed standard deviation
of the density estimates. The limited performance of standard error estimates from the program
DISTANCE reflects the fact that variance is being estimated as though sampling occurred from a
random sample of possible PSUs. These PSUs are assumed to belong to a population with
distribution mean μ and standard deviation σ. In this scenario, where the PSUs represent the
entire population, this assumption is false, so between-PSU variation (as noted by Buckland et al.,
1993) biases the estimate of variance.
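
A small simulation can make this mechanism concrete. The sketch below is an illustration under stated assumptions rather than a reproduction of the simulations reported here: the ten PSU densities are hypothetical, every PSU is surveyed in every repeat, counts are Poisson around each fixed density, and the raw line counts stand in for line density estimates. The helper names are invented for the example.

```python
import math
import random
import statistics


def poisson(rng, mean):
    """Draw a Poisson count with Knuth's multiplication method (fine for modest means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def survey(rng, psu_densities):
    """One complete survey: a noisy count on every PSU in the study area."""
    return [poisson(rng, d) for d in psu_densities]


def between_line_se(line_estimates):
    """SE of the mean when lines are treated as a random sample of possible PSUs."""
    return statistics.stdev(line_estimates) / math.sqrt(len(line_estimates))


rng = random.Random(1999)
# Hypothetical fixed densities; the PSUs are the whole population, not a sample.
psu_densities = [5, 8, 12, 20, 35, 60, 90, 150, 240, 380]

repeats = [survey(rng, psu_densities) for _ in range(500)]
overall_means = [statistics.mean(r) for r in repeats]

observed_sd = statistics.stdev(overall_means)  # actual spread of the estimate
average_se = statistics.mean(between_line_se(r) for r in repeats)

print("observed SD of overall estimate:", round(observed_sd, 2))
print("average between-line SE:", round(average_se, 2))
```

Because the same PSUs are surveyed every time, the fixed differences between them never contribute to the spread of the overall estimate, yet the between-line estimator counts them as sampling error and so reports a much larger standard error.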
Although estimates of variance from study two were consistent underestimates, they are not
considered dissimilar to the true and observed variances of density. Indeed they are markedly
improved from study one and indicate that, as was expected, the program DISTANCE estimator of
variance is definitely incorporating the variation between transects. Therefore, even though it
assumes that density is homogeneously distributed according to a stochastic process with mean μ
and standard deviation σ (Buckland et al., 1993), program DISTANCE can cope quite adequately
with a heterogeneous process where there is large variation between transect densities.
It is not apparent why the estimated variances were consistently underestimates of both the
true and observed variances of density. We suggest that it is due to some characteristic of the lognormal distribution from which sampling occurred (McArdle et al., 1990). However, this has not
been investigated and requires further study before any general conclusions can be drawn.
Because program DISTANCE includes no information on the error associated with the repeat
sampling of individual transects, variance has to be estimated from the encounter rate between
transects. If the actual density of animals around each line is highly variable, it has been shown
that the estimate of variance from DISTANCE is an overestimate of the variance associated with the
estimates of density. Because the stratified approach used here is based on the assumption of
Poisson error, it is inadequate as a robust alternative for the sampling of natural populations.
Future study into the ability of estimators, based on the information from repeat sampling of the
same line, would be useful in providing an estimate of the within-line variation and differences in
the spatial distribution of animals between transects. Such a design is practical in many situations
where a single sample of PSUs will not provide adequate detections for the reliable estimation
of f(0).
8. APPLICATION TO WILDLIFE POPULATIONS
Despite a growing literature on the subject, the problem of estimating animal
abundance remains in large part unsolved. Past efforts toward comprehensive and systematic estimation of
density (D) or population size (N) have usually been inadequate (Otis et al., 1978). Distance
sampling, however, is promoted as an efficient and cost-effective method for the reliable
estimation of animal abundance (Burnham et al., 1980). The recent release of program DISTANCE (Laake
et al., 1994) provides an automated procedure for the analysis of distance sampling data and
estimation of population density.
Researchers interested in collecting distance sampling data should refer to Buckland et al.
(1993) for an understanding of the concepts of distance sampling and the design and protocol of
relevant field studies. This study has investigated the ability of program DISTANCE to accurately
estimate population abundance. Although it is understood that in a number of practical
situations detection on the transect will not always be certain, nor animals always located prior to
moving due to the presence of the observer, neither of these conditions has been dealt with. It is
therefore assumed that the design of studies and collection of data is carried out carefully and
measurements made without error. Although exact distances were used throughout the simulations,
the analysis of ungrouped data is only recommended when distances are measured to a
high degree of accuracy (Buckland 1985). It is rarely possible to achieve such precision in the
field, especially with mobile animal populations, and distances should instead be recorded in
intervals, e.g. between 0 < 5 < 10 < 17.5 < 25 < ∞.
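
As a minimal sketch of recording distances in intervals, the following groups perpendicular distances using the cut points 5, 10, 17.5 and 25 from the example above, with the last interval open-ended. The function names, the treatment of a distance that falls exactly on a cut point (assigned to the upper interval), and the sample distances are assumptions for illustration only.

```python
from bisect import bisect_right
from collections import Counter

# Interior boundaries of the distance intervals; the final interval is open-ended.
CUT_POINTS = [5.0, 10.0, 17.5, 25.0]


def interval_index(distance):
    """Return 0 for [0, 5), 1 for [5, 10), 2 for [10, 17.5), 3 for [17.5, 25), 4 beyond."""
    if distance < 0:
        raise ValueError("perpendicular distances cannot be negative")
    return bisect_right(CUT_POINTS, distance)


def group_distances(distances):
    """Count detections in each interval, in order of increasing distance."""
    counts = Counter(interval_index(d) for d in distances)
    return [counts.get(i, 0) for i in range(len(CUT_POINTS) + 1)]


# Hypothetical perpendicular distances from a single transect
observed = [0.0, 4.9, 5.0, 12.3, 17.5, 24.9, 30.0]
print(group_distances(observed))
```

Grouped counts of this kind discard the exact distances, so small measurement errors within an interval no longer affect the data that are analysed.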
The small proportion of bias in program DISTANCE's estimates of density provides researchers
with considerable confidence when establishing studies to collect distance sampling data.
Although it is recommended that sampling is designed to produce encounter rates as large as
possible, there was no consistent increase in bias when possible encounter rates were reduced by
up to an order of magnitude. Similarly, whilst in a number of sampling situations, such as highly
variable forested habitats, it may appear that the true process of detection is frequently changing
between PSUs, the results show no consistent loss of precision through estimating sample
densities with a single detection function. This is highly encouraging for researchers planning
studies of populations over heterogeneous terrain, as it indicates that the possible models advanced
by program DISTANCE are suitably robust to the flexibility of the true detection process and to the
pooling over different factors affecting detection.
As expected, the performance of the estimates of precision depends on what proportion of the
study area has been surveyed by the transects. If that proportion is large, then the presence of
large between-PSU variation will cause the variance estimates from program DISTANCE to be a
sizeable overestimate of the true value, while the stratified estimate, which relies on the Poisson
assumption, tends to be an underestimate. We suggest that in this situation stratifying the study
area and then subdividing strata into replicate PSUs would, if the number of detections was large
enough, provide precision estimates that did not need to assume Poisson distribution but avoid
the incorporation of between-strata variance.
If the encounter rate does not allow subdivision of strata, then the problem can be partially
accommodated by aligning the transects so that they lie parallel to any hypothetical gradient
within the strata (e.g. Cassey 1997). This reduces the between-transect variance (at the expense of
increasing the within-transect variance).
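
The trade-off can be seen in a toy example. The sketch assumes a hypothetical density surface that increases linearly along one axis of a 10 by 10 grid and compares transects laid parallel to that gradient with transects laid across it; the surface, grid size, and function names are illustrative assumptions and are not taken from the study.

```python
import statistics


def density(x, y):
    """Hypothetical density surface with a gradient along x only."""
    return 10.0 + 5.0 * x


GRID = range(10)

# Transects parallel to the gradient: each line sits at a fixed y and runs along x.
parallel = [[density(x, y) for x in GRID] for y in GRID]
# Transects perpendicular to the gradient: each line sits at a fixed x and runs along y.
perpendicular = [[density(x, y) for y in GRID] for x in GRID]


def between_transect_variance(lines):
    """Variance of the per-transect mean densities."""
    return statistics.pvariance([statistics.mean(line) for line in lines])


def mean_within_transect_variance(lines):
    """Average variance of density along individual transects."""
    return statistics.mean(statistics.pvariance(line) for line in lines)


for label, lines in (("parallel", parallel), ("perpendicular", perpendicular)):
    print(
        label,
        between_transect_variance(lines),
        mean_within_transect_variance(lines),
    )
```

Each parallel transect samples the whole gradient, so the transect means agree and the variation moves inside the lines; perpendicular transects each sit at one point on the gradient, so the same variation appears between lines instead.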
When there is little variation between real transect densities, the variance estimate from
program DISTANCE appears adequate. When the PSUs are a random sample from a large
population of available PSUs then the variance estimates seem quite reliable. Buckland et al.'s
concerns with the assumptions of homogeneity therefore seem overly cautious.
REFERENCES
Akaike, H. (1973). `Information theory and an extension of the maximum likelihood principle'. In
International Symposium on Information Theory, 2nd edn, eds. B. N. Petrov and F. Csáki. Budapest:
Akadémiai Kiadó, 267–281.
Alldredge, J. R. and Gates, C. E. (1985). `Line transect estimators for left truncated distributions'.
Biometrics 41, 273–280.
Buckland, S. T. (1985). `Perpendicular distance models for line transect sampling'. Biometrics 41, 177–195.
Buckland, S. T. (1992). `Fitting density functions using polynomial'. Applied Statistics 41, 63–76.
Buckland, S. T. and Turnock, B. J. (1992). `A robust line transect method'. Biometrics 48, 901–909.
Buckland, S. T., Anderson, D. R., Burnham, K. P. and Laake, J. L. (1993). Distance Sampling: Estimating
Abundance of Biological Populations. London: Chapman and Hall.
Burnham, K. P. and Anderson, D. R. (1984). `The need for distance data in transect counts'. Journal of
Wildlife Management 18, 1248–1254.
Burnham, K. P. and Anderson, D. R. (1992). `Data-based selection of an appropriate biological model: the
key to modern data analysis'. In Wildlife 2001: Populations, eds. D. R. McCullough and R. H. Barrett.
London: Elsevier Science, 16–30.
Burnham, K. P., Anderson, D. R. and Laake, J. L. (1980). `Estimation of density from line transect
sampling of biological populations'. Wildlife Monograph No. 72.
Cassey, P. (1997). `Estimating animal abundance: an assessment of distance sampling techniques for New
Zealand populations'. MSc. Thesis, University of Auckland.
Conroy, M. J., Cohen, Y., James, F. C., Matsinos, Y. G. and Maurer, B. A. (1995). `Parameter estimation,
reliability, and model improvement for spatially explicit models of animal populations'. Ecological
Applications 5, 17–19.
Jensen, A. L. (1996). `Subsampling with line transects for estimation of animal abundance'. Environmetrics
7, 283–289.
Krebs, C. J. (1985). Ecology: The Experimental Analysis of Distribution and Abundance, 3rd edn. New York:
Harper Collins.
Laake, J. L. (1978). `Line transect estimators robust to animal movement'. MSc. Thesis, Utah State
University.
Laake, J. L., Buckland, S. T., Anderson, D. R. and Burnham, K. P. (1994). DISTANCE User's Guide. Fort
Collins: Colorado Cooperative Fish and Wildlife Research Unit, Colorado State University.
Otis, D. L., Burnham, K. P., White, G. C. and Anderson, D. R. (1978). `Statistical inference from capture
data on closed animal populations'. Wildlife Monographs 62, 1–135.
McArdle, B. H., Gaston, K. J. and Lawton, J. H. (1990). `Variation in the size of animal populations:
patterns, problems, and artefacts'. Journal of Animal Ecology 59, 439–454.
Parmenter, R. R., MacMahon, J. A. and Anderson, D. R. (1989). `Animal density estimation using a
trapping web design: field validation experiments'. Ecology 70, 169–179.
Pollock, K. H. and Kendall, W. L. (1987). `Visibility bias in aerial surveys: a review of estimation
procedures'. Journal of Wildlife Management 51, 502–509.
Quang, P. X. and Lanctot, R. B. (1991). `A line transect model for aerial surveys'. Biometrics 47, 1089–1102.
SAS Institute (1994). SAS User's Guide for Personal Computers: Version 6.10 Edition. Cary, NC: SAS
Institute Inc.
Seber, G. A. F. (1982). The Estimation of Animal Abundance and Related Parameters. New York: Macmillan.
Seber, G. A. F. (1992). `A review of estimating animal abundance II'. International Statistics Review 60,
129–166.
Sinnary, A. S. M. and Hebrand, J. J. (1991). `A new approach for detecting visibility bias for the fixed-width
transect method'. African Journal of Ecology 29, 222–228.
Smith, G. E. (1979). `Some aspects of line transect sampling when the target population moves'. Biometrics
35, 323–329.
Soule, M. E. (1986). Conservation Biology: the Science of Scarcity and Diversity. Sunderland, MA: Sinauer
Associates.
Southwell, C. (1994). `Evaluation of walked line transect counts for estimating macropod density'. Journal
of Wildlife Management 58, 348–356.
Thompson, S. K. (1992). Sampling. New York: Wiley.
Turnock, B. H. and Quinn II, T. J. (1991). `The effect of responsive movement on abundance estimation
using line transect sampling'. Biometrics 47, 701–715.
Verner, J. and Milne, K. A. (1989). `Coping with sources of variability when monitoring population trends'.
Annales Zoologici Fennici 26, 191–199.

Received 1 May 1998
