
STAT 431

Geoffrey Michael Williams


Computing Project
21 Oct 2015
Background: In the Barro Colorado Island 50-hectare tropical forest dynamics plot in Panama,
ecologists have been collecting data on all individuals > 10 cm diameter-at-breast-height (DBH)
of all 252 species of trees that occur on the plot since 1982, at roughly 5-year intervals. The data
are available online through the Smithsonian Tropical Research Institute's website. Ecologists are
generally interested in tracking changes in the species composition of the forest community, and
may want to know which species are increasing or decreasing in abundance. For a given species,
abundance is calculated as the number of individuals in the plot. Over the course of the 30 years
the plot data have been collected, changes in abundance for each species are approximately linear,
because tree lifespans (>200 yrs) are much longer than 30 years. 252 linear regressions were
performed in R. Each regression treated species abundance as the response variable and the
corresponding census years 1982, 1985, 1990, 1995, 2000, and 2005 (coded as years since 1982)
as the independent variable. Next, each model was checked for a linear trend with a two-tailed
t-test (alpha=0.002) of Beta 1, the average change in number of individuals for that species per
year. The species that passed the test were plotted with the regression line in color-coded scatter
plots (Appendix 1). In each test, the null hypothesis was that the coefficient Beta 1 is equal to
zero, and it was tested by comparing the magnitude of the t-statistic for Beta 1 to critical t
(df=4, alpha/2=0.001) = 7.173.
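The slope test described above can be sketched with R's built-in lm() and qt(). This is only an illustration under stated assumptions: the counts are invented and are not taken from the BCI data, and the variable names (years, count, fit) are hypothetical.

```r
# Hypothetical abundance counts for one species at the six census times
# (illustrative values only, NOT taken from the BCI data set)
years <- c(0, 3, 8, 13, 18, 23)            # years since 1982
count <- c(120, 131, 148, 170, 187, 205)

fit <- lm(count ~ years)                    # simple linear regression
slope_t <- summary(fit)$coefficients["years", "t value"]

alpha <- 0.002
# two-sided critical value; about 7.173 for df = 6 - 2 = 4
t_crit <- qt(1 - alpha / 2, df = length(years) - 2)

# Reject H0: Beta 1 = 0 when |t| exceeds the critical value
significant <- abs(slope_t) > t_crit
significant
```

Computing the critical value with qt() rather than typing it in keeps the cutoff tied to alpha and the degrees of freedom, should either change.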
Results: 43 species of trees (20 increasing, 23 decreasing) showed rates of increase or decline
that were statistically significant at alpha=0.002. Fitted models for all 43 species had an
R-squared of at least 0.93 and an F-statistic greater than the critical F(1, 4, alpha=0.002) = 51.5.
The average change in number of individuals per year among these species ranged from -24.4 to 28.4.
The data are summarized in Appendix 2.
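The two cutoffs quoted above are linked: in a regression with a single predictor, the F-statistic is the square of the slope's t-statistic, and R-squared equals F / (F + residual df). A minimal sketch of that arithmetic in base R follows; the variable names are illustrative.

```r
alpha <- 0.002
df_res <- 4                                     # 6 time points minus 2 parameters

t_crit <- qt(1 - alpha / 2, df = df_res)        # two-sided critical t
f_crit <- qf(1 - alpha, df1 = 1, df2 = df_res)  # critical F for the same test

# The squared critical t and the critical F agree up to numerical error
isTRUE(all.equal(t_crit^2, f_crit, tolerance = 1e-6))

# Smallest R-squared a six-point fit can have and still pass the test
r2_min <- f_crit / (f_crit + df_res)
r2_min                                          # a little under 0.93
```

This is why every species that passed the t-test also shows an R-squared close to or above 0.93: with only four residual degrees of freedom, the two criteria are the same criterion.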
Discussion: The linear model was chosen because it approximates changes in abundance for a
given species accurately over a short time scale; over longer time scales, population growth and
decay are nonlinear. The major limitation of the chosen analysis is the small sample size of the
abundance time series used in the regressions; there are only 6 time points for each species.
This may explain the high F-statistics and Beta 1 t-statistics. To account for this shortcoming in
the data, alpha was set at 0.002 instead of 0.05. Six time points should be sufficient because
we are assuming recruitment and mortality are constant, as explained in the following paragraph.
Another concern is that larger overall abundance could, by chance alone, result in larger changes
in abundance. However, this is countered by the observation of consistent trends over time.
The rate of change in abundance is equal to the recruitment rate (new saplings >10 cm DBH
per year) minus the mortality rate. Assuming negligible variation in recruitment and mortality,
additional major fluctuations in abundance would not have been observed under a more frequent
sampling scheme. The estimated Beta 1 and abundance could be used as input parameters for a model
of population growth over longer time scales. Another question that could be asked is whether
changes in abundance have resulted in changes in which species are the most or the least abundant.
Tropical forests are composed of many rare species, which contribute to very high diversity, and
a few hyperdominant species that account for a disproportionate amount of total abundance.
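The closing question about shifts in abundance rank could be explored along the following lines. This is a rough sketch only: the four species labels and all counts are invented placeholders (not BCI data), and the column names are hypothetical.

```r
# Made-up counts for four placeholder species (NOT BCI data)
counts <- data.frame(
  species = c("sp_A", "sp_B", "sp_C", "sp_D"),
  n1982   = c(900, 400, 350, 20),
  n2005   = c(700, 380, 600, 25)
)

# Rank 1 = most abundant species in that census
counts$rank1982 <- rank(-counts$n1982, ties.method = "min")
counts$rank2005 <- rank(-counts$n2005, ties.method = "min")

# Positive values mean the species moved up the abundance ranking
counts$rank_change <- counts$rank1982 - counts$rank2005

# Species whose rank differs between the first and last census
counts[counts$rank_change != 0, ]
```

With the real data, the same comparison could be restricted to the species with significant slopes to see whether any of them changed position among the most or least abundant species.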

Appendix 1: [Figure: x axis: time; y axis: abundance (# of individuals); 34 changing spp]

Appendix 2: List of species with statistically significant positive (+) linear coefficients
 [1] Alseis blackiana          Cassipourea elliptica     Cupania seemannii
 [4] Drypetes standleyi        Eugenia oerstediana       Faramea occidentalis
 [7] Garcinia intermedia       Hirtella triandra         Inga acuminata
[10] Inga thibaudiana          Lacmellea panamensis      Laetia procera
[13] Pouteria reticulata       Protium tenuifolium       Spondias radlkoferi
[16] Tabernaemontana arborea   Tachigali versicolor      Tetragastris panamensis
[19] Trichilia pallida         Xylopia macrantha

List of species with statistically significant negative (-) linear coefficients


 [1] Adelia triloba            Astrocaryum standleyanum  Beilschmiedia pendula
 [4] Casearia arborea          Casearia sylvestris       Dendropanax arboreus
 [7] Ficus tonduzii            Guarea 'fuzzy'            Guatteria dumetorum
[10] Hasseltia floribunda      Hirtella americana        Inga cocleensis
[13] Lacistema aggregatum      Lindackeria laurina       Lonchocarpus heptaphyllus
[16] Platymiscium pinnatum     Platypodium elegans       Poulsenia armata
[19] Siparuna guianensis       Sloanea terniflora        Trichilia tuberculata
[22] Trophis racemosa          Zuelania guidonia

All the species names, rate of change in # of trees/yr (Beta 1), R-squared, and F statistic

    Species                    Beta 1 (trees/yr)    R-squared           F
1   Adelia triloba             -2.03539445628998    0.981604298259277   213.442099049975
2   Alseis blackiana            9.30703624733475    0.970441712163511   131.325835587207
3   Astrocaryum standleyanum   -3.05373134328358    0.970477528480617   131.490011307953
4   Beilschmiedia pendula      -1.56460554371002    0.94963403471829    75.4187101870688
5   Casearia arborea           -3.09808102345416    0.970613594178035   132.117360667843
6   Casearia sylvestris        -1.18592750533049    0.972020156888916   138.960058214748
7   Cassipourea elliptica       1.28102345415778    0.975708533542084   160.666880319054
8   Cupania seemannii           1.16162046908316    0.945117729325721   68.8832818113459
9   Dendropanax arboreus       -1.61791044776119    0.99390418374449    652.187757691084
10  Drypetes standleyi          5.1590618336887     0.952832347092786   80.8038804870926
11  Eugenia oerstediana         2.5863539445629     0.937443176583637   59.941865675898
12  Faramea occidentalis        28.4409381663113    0.930186450234486   53.2954679060872
13  Ficus tonduzii             -1.21151385927505    0.986503546632395   292.374157791785
14  Garcinia intermedia         2.25245202558635    0.928617826970584   52.0363999895552
15  Guarea 'fuzzy'             -2.32196162046908    0.988590274724697   346.578116780642
16  Guatteria dumetorum        -4.07547974413646    0.945349868078739   69.1928699781216
17  Hasseltia floribunda       -3.43965884861407    0.957492002792694   90.0999403122319
18  Hirtella americana         -0.249466950959488   0.953844224256867   82.6630434782609
19  Hirtella triandra           10.9462686567164    0.989417645823737   373.987726868184
20  Inga acuminata              1.8409381663113     0.945098116775584   68.8572457824381
21  Inga cocleensis            -1.25501066098081    0.977891547595055   176.926277730114
22  Inga thibaudiana            1.86993603411514    0.957793424786226   90.7719633668507
23  Lacistema aggregatum       -0.527078891257996   0.936019410337475   58.5189611583545
24  Lacmellea panamensis        0.922814498933902   0.947329495110514   71.9438324806808
25  Laetia procera              0.280597014925373   0.982089552238806   219.333333333333
26  Lindackeria laurina        -1.5091684434968     0.983781013360689   242.62453265144
27  Lonchocarpus heptaphyllus  -1.79957356076759    0.96667520703147    116.030753192599
28  Platymiscium pinnatum      -0.867803837953092   0.940852855745627   63.6279480679112
29  Platypodium elegans        -0.95181236673774    0.931774211648524   54.6288571616554
30  Poulsenia armata           -12.4660980810235    0.987321749880273   311.500953382842
31  Pouteria reticulata         2.66098081023454    0.945803158798333   69.8050393954877
32  Protium tenuifolium         2.6272921108742     0.964587729878788   108.955198475232
33  Siparuna guianensis        -0.322814498933902   0.950858271178847   77.3972177201512
34  Sloanea terniflora         -0.850746268656716   0.963792621220982   106.474719101124
35  Spondias radlkoferi         0.33773987206823    0.928784648187633   52.1676646706586
36  Tabernaemontana arborea     2.8865671641791     0.950302666909604   76.4872163406485
37  Tachigali versicolor        1.15991471215352    0.930120288047633   53.2412204951066
38  Tetragastris panamensis     3.55479744136461    0.948067298157644   73.0227594193397
39  Trichilia pallida           0.727931769722814   0.970007440216116   129.366409163559
40  Trichilia tuberculata      -24.4243070362473    0.992421305207608   523.795367088204
41  Trophis racemosa           -1.45756929637527    0.984579417986296   255.393581671913
42  Xylopia macrantha           4.63283582089552    0.984183190422545   248.895502118301
43  Zuelania guidonia          -0.282302771855011   0.934422174840085   56.9962283782027

Appendix 3. R Code (also attached as a document)


abundance <- read.csv("Abundance.csv")   # 50 ha plot data
t <- c(0, 3, 8, 13, 18, 23)              # time in years since 1982 (1982, 85, 90, etc.)
tbar <- mean(t)                          # average time
n <- length(abundance$Species.Name)      # number of regressions to be performed

# linear regression determining change in abundance of species over time
# storing values in the matrix "species", cols = beta0, beta1, t-stat for beta1, R2, F
species <- matrix(nrow = n, ncol = 5)
for (i in seq_len(n)) {
  # y is the vector of abundance values (columns 3 to 8) for the entry with index i
  y <- unlist(abundance[i, 3:8], use.names = FALSE)
  ybar <- mean(y)                                  # mean abundance
  sxy <- sum((t - tbar) * (y - ybar))              # sum of cross-products of deviations
  sxx <- sum((t - tbar)^2)                         # sum of squared deviations of time
  beta1 <- sxy / sxx                               # slope
  beta0 <- ybar - beta1 * tbar                     # intercept
  MSRes <- sum((y - (beta0 + beta1 * t))^2) / 4    # residual mean square, df=4
  tstat <- beta1 / sqrt(MSRes / sxx)               # t value for the linear coefficient
  SSReg <- sum((beta0 + beta1 * t - ybar)^2)       # sum of squares from the regression
  SSTot <- sum((y - ybar)^2)                       # total sum of squares of the abundance data
  R2 <- SSReg / SSTot
  Fstat <- SSReg / MSRes                           # named Fstat so F (FALSE) is not masked
  species[i, ] <- c(beta0, beta1, tstat, R2, Fstat)
}
tc <- 7.173   # critical two-sided t value for alpha=0.002 and df=4, i.e. qt(1 - 0.001, df = 4)

# this outputs all the data created in the first for loop from the regressions in a matrix
# and also a list of species whose abundance correlated positively or negatively with time
# at a significance level of at least 0.002 (t test coefficient) and range of linear coefficient
species
# outputs species that changed positively and negatively
abundance$Species.Name[which(abs(species[, 3]) > tc & species[, 2] > 0)]
abundance$Species.Name[which(abs(species[, 3]) > tc & species[, 2] < 0)]
# species with significant linear coefficients
spp <- abundance$Species.Name[which(abs(species[, 3]) > tc)]
# average changes in abundance in trees/year, R2 and F statistics for significant spp
data <- species[which(abs(species[, 3]) > tc), c(2, 4, 5), drop = FALSE]
# output species and linear coefficient, and stats listed in previous comment, together
data.frame(Species = as.vector(spp), Beta1 = data[, 1], R_Squared = data[, 2], F = data[, 3])
# range of values for average change in abundance in trees/year
range(species[which(abs(species[, 3]) > tc), 2])

# the rest of this script outputs a color-coded 4-pane graphical representation of the models
# for those species whose abundances showed a linear correlation with time, with alpha=0.002
# based on a two-tailed t-test of Beta 1 where critical t (df=4, alpha/2=0.001) = 7.173
layout(matrix(c(1, 2, 3, 4), 2, 2, byrow = TRUE))

# panel 1: significant species with 1982 abundance of at least 500
set1 <- which(abs(species[, 3]) > tc & abundance$X1982 >= 500)   # choosing rows
z <- length(set1)
ylim <- range(subset.data.frame(abundance[set1, ], select = X1982:X2005))
plot.new()
plot.window(range(t), c(0, ylim[2]))
axis(1)
axis(2)
palette(rainbow(z))
leg <- 5   # x position of the next species label
for (j in seq_len(z)) {
  points(t, subset.data.frame(abundance[set1[j], ], select = X1982:X2005), col = j)
  text(leg, species[set1[j], 1] + species[set1[j], 2] * leg - 50, abundance$Species.Name[set1[j]], col = j)
  lines(c(0, 23), c(species[set1[j], 1], species[set1[j], 1] + species[set1[j], 2] * 23), col = j)
  leg <- leg + 5
  if (leg > 20) leg <- 5
}

# panel 2: 1982 abundance from 190 up to (not including) 500
set2 <- which(abs(species[, 3]) > tc & abundance$X1982 >= 190 & abundance$X1982 < 500)
z <- length(set2)
ylim <- range(subset.data.frame(abundance[set2, ], select = X1982:X2005))
plot.new()
plot.window(range(t), c(ylim[1], ylim[2]))
axis(1)
axis(2)
palette(rainbow(z))
for (k in seq_len(z)) {
  points(t, subset.data.frame(abundance[set2[k], ], select = X1982:X2005), col = k)
  text(leg, species[set2[k], 1] + species[set2[k], 2] * leg - 20, abundance$Species.Name[set2[k]], col = k)
  lines(c(0, 23), c(species[set2[k], 1], species[set2[k], 1] + species[set2[k], 2] * 23), col = k)
  leg <- leg + 5
  if (leg > 20) leg <- 5
}

# panel 3: 1982 abundance from 75 up to (not including) 190
set3 <- which(abs(species[, 3]) > tc & abundance$X1982 >= 75 & abundance$X1982 < 190)
z <- length(set3)
ylim <- range(subset.data.frame(abundance[set3, ], select = X1982:X2005))
plot.new()
plot.window(range(t), c(0, ylim[2]))
axis(1)
axis(2)
palette(rainbow(z))
for (j in seq_len(z)) {
  points(t, subset.data.frame(abundance[set3[j], ], select = X1982:X2005), col = j)
  text(leg, species[set3[j], 1] + species[set3[j], 2] * leg - 10, abundance$Species.Name[set3[j]], col = j)
  lines(c(0, 23), c(species[set3[j], 1], species[set3[j], 1] + species[set3[j], 2] * 23), col = j)
  leg <- leg + 3
  if (leg > 20) leg <- 5
}

# panel 4: 1982 abundance below 75
set4 <- which(abs(species[, 3]) > tc & abundance$X1982 < 75)
z <- length(set4)
ylim <- range(subset.data.frame(abundance[set4, ], select = X1982:X2005))
plot.new()
plot.window(range(t), c(0, ylim[2]))
axis(1)
axis(2)
palette(rainbow(z))
for (k in seq_len(z)) {
  points(t, subset.data.frame(abundance[set4[k], ], select = X1982:X2005), col = k)
  text(leg, species[set4[k], 1] + species[set4[k], 2] * leg - 5, abundance$Species.Name[set4[k]], col = k)
  lines(c(0, 23), c(species[set4[k], 1], species[set4[k], 1] + species[set4[k], 2] * 23), col = k)
  leg <- leg + 5
  if (leg > 20) leg <- 5
}
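The hand-computed estimates in the script can be checked against R's built-in lm(). The sketch below repeats the script's formulas for a single species and compares them with the lm() summary. The counts are hypothetical (not BCI data), and the names yrs, ms_res, and t_stat are illustrative, chosen to avoid masking R's t() function and the T/F shorthands.

```r
yrs <- c(0, 3, 8, 13, 18, 23)              # years since 1982
y   <- c(310, 295, 270, 251, 228, 204)     # hypothetical counts, NOT BCI data

# Hand calculation, following the formulas used in the script
tbar   <- mean(yrs)
ybar   <- mean(y)
sxx    <- sum((yrs - tbar)^2)
beta1  <- sum((yrs - tbar) * (y - ybar)) / sxx
beta0  <- ybar - beta1 * tbar
ms_res <- sum((y - (beta0 + beta1 * yrs))^2) / (length(yrs) - 2)
t_stat <- beta1 / sqrt(ms_res / sxx)

# Same quantities from lm()
fit <- summary(lm(y ~ yrs))
slope_ok <- isTRUE(all.equal(fit$coefficients["yrs", "Estimate"], beta1))
t_ok     <- isTRUE(all.equal(fit$coefficients["yrs", "t value"], t_stat))
f_ok     <- isTRUE(all.equal(unname(fit$fstatistic["value"]), t_stat^2))
c(slope_ok, t_ok, f_ok)
```

If all three comparisons come back TRUE, the manual sums of squares reproduce the standard least-squares fit, and the loop could equally be written around lm() at some cost in speed.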
