
TREE DIVERSITY ANALYSIS: A manual and software for common statistical methods for ecological and biodiversity studies

Using the BiodiversityR software within the R 2.10.0 environment


R Kindt, World Agroforestry Centre, Nairobi (Kenya)

1. Introduction
The software accompanying the Tree Diversity Analysis manual was developed for the R 2.1.1 environment. Since the publication of the manual at the end of 2005, new versions of the base R environment and its accompanying packages have become available, which made it necessary to modify the BiodiversityR software. While these changes were being implemented, the opportunity was taken to develop BiodiversityR into a package that can be installed and is documented in the same way as other R packages. Some new functions were integrated in the new version of the package. Functions that were not directly associated with the graphical user interface (GUI) provided by BiodiversityR were documented separately.

The GUI of BiodiversityR is supported by the R Commander package (Figure 1). Options selected from a specific menu generate commands that are placed in the "Script Window" of the R Commander. These commands return results that are shown in the "Output Window" and potentially as graphs (Figures 2 and 3). Highlighting some commands and clicking on the "Submit" button re-submits commands and generates new results. We encourage users to read the help files to understand the various functions and their arguments and also to explore modifying some of the options.

This document shows where the Tree Diversity Analysis manual has become outdated. Please read the manual for more information on community ecology and biodiversity analysis.

Figure 1. The graphical user interface (GUI) of the BiodiversityR package is integrated in the R Commander and is available from the right-hand side of the top menu.

Figure 2. The GUI of BiodiversityR allows the user to select different options for calculations, for example the exact calculation method for species accumulation curves (these options are available from the menu option of: BiodiversityR > Analysis of diversity > Species accumulation curves). Clicking on the "OK" button generates commands that are submitted to the Script Window of the R Commander (see Figure 3). Clicking on the "Plot" button generates commands that produce specific graphics based on the results that were obtained earlier (in the example shown below, graphical results use data from the Dune.accum result).

Figure 3. Selection of specific options from a BiodiversityR window results in commands that are shown in the Script Window of the R Commander (the commands shown here were generated by the options shown in Figure 2 after clicking the "OK" button). It is possible to modify these commands in the Script Window and to obtain new results by clicking on the "Submit" button.

2. Main changes in the software


The main changes in the software include the following:
- Installation
- Using the package and the graphical user interface after the package was installed
- Importing data via Excel workbooks or Access databases

2.1. Installation

In the new version of the software, the package is installed and loaded like any other package developed within the R statistical environment. An accompanying document (Installation of BiodiversityR in Windows) provides instructions on how BiodiversityR can be installed and used under MS Windows. This accompanying document replaces most of the information that is available in Chapter 3: Doing biodiversity analysis with Biodiversity.R of the manual. An important change from the previous installation instructions is that the step of copying the Biodiversity.R and Rcmdr-menus.txt files is no longer needed (page 34 in the manual).

You need the following packages for all options of BiodiversityR. The version of each package that I am currently using is indicated between brackets. The first four packages are essential, although most of the other packages are also frequently used.

BiodiversityR (1.4.2)
car (1.2-16)
Rcmdr (1.5-3) [Note that this may be the only version of the R-Commander compatible with BiodiversityR]
vegan (1.15-4)
abind (1.1-0)
akima (0.5-3)
aplpack (1.2-2)
colorspace (1.0-1)
effects (2.0-11)
ellipse (0.3-5)
Hmisc (3.7-0)
lmtest (0.9-24)
maptree (1.4-5)
mgcv (1.5-6)
multcomp (1.1-2)
mvtnorm (0.9-8)
relimp (1.0-1)
rgl (0.87)
RODBC (1.3-1)
sp (0.9-44) [only for one function for training purposes]
splancs (2.01-25) [only for one function for training purposes]
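One way to see which of the packages listed above are still missing from an R installation is sketched below. It is only an illustration that uses base R; the object names (essential, optional, available) are chosen for this example and are not part of BiodiversityR.

```r
# Package names copied from the list above.
essential <- c("BiodiversityR", "car", "Rcmdr", "vegan")
optional <- c("abind", "akima", "aplpack", "colorspace", "effects", "ellipse",
              "Hmisc", "lmtest", "maptree", "mgcv", "multcomp", "mvtnorm",
              "relimp", "rgl", "RODBC", "sp", "splancs")

# installed.packages() returns a matrix with one row per installed package.
available <- rownames(installed.packages())

# Packages from each group that are not yet installed.
missing.essential <- setdiff(essential, available)
missing.optional <- setdiff(optional, available)

if (length(missing.essential) > 0) {
  message("Essential packages still to install: ",
          paste(missing.essential, collapse = ", "))
}
if (length(missing.optional) > 0) {
  message("Optional packages still to install: ",
          paste(missing.optional, collapse = ", "))
}
```

This check only reports whether a package is present; it does not compare the installed versions with the versions given between brackets.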

2.2. Using the package and the graphical user interface after the package was installed

As the software is now a standard package, use the following command to load the package after it has been installed (alternatively, use the menu options: Packages > Load package):

library(BiodiversityR)

Note that R is case-sensitive, so never use capitals where these are not shown. To access the graphical user interface of the package (still based on the Rcmdr package), use:

BiodiversityRGUI()

To learn more about the features of the BiodiversityR package, use the menu options of: BiodiversityR > Help about BiodiversityR > Help about BiodiversityR, or type:

help("BiodiversityRGUI", help_type="html")

2.3. Importing data via Excel workbooks or Access databases

A new feature of the updated package is that data can be imported from Excel workbooks or Access databases. To be able to import data for the community and environmental data sets (read Chapter 2: Data preparation for more information about these data sets), data for the environmental data set need to be available from an Excel worksheet (alternatively an Access table) named environmental (Figure 4). Data for the community data set should either be imported as a matrix (formatted as sites × species, with species abundances as cell entries) from an Excel worksheet (alternatively an Access table) named community (Figure 5), or be available in a stacked format (with separate columns for sites, species and abundances) from an Excel worksheet (alternatively an Access table) named stacked (Figure 6). Both data sets should be available from the same Excel workbook (or Access database).

More information on importing data from Excel or Access is available from the help provided for the import.from.Excel and import.from.Access functions:

help("import.from.Excel", help_type="html")
help("import.from.Access", help_type="html")

The accompanying document (Installation of BiodiversityR in Windows) also provides instructions on preparing Excel files, as well as some suggestions to avoid problems in importing data. Users who do not have access to MS Excel can use the OpenOffice Calc program to prepare the data and save them as an MS Excel file. OpenOffice can be obtained from www.openoffice.org.

Some users have experienced problems importing data. Some suggestions to avoid such problems are the following:
- Avoid spaces in names of variables as much as possible. Use variable names such as soil_texture or soil.texture rather than soil texture.
- Try not to use special characters in data sets.
- Avoid capital letters in the names of worksheets or tables.
- Prior to importing data from the stacked format, also replace spaces in names of species (since these will become variable names).
- Use a strict scheme for capital letters. In particular, check whether the number of species after importing data from the stacked format is what you expected. Since R is case-sensitive, species names such as Olea_capensis and olea_capensis will be interpreted as different species in R. You can determine the number of species from the number of columns in the community data set or via the menu option of: BiodiversityR > Analysis of diversity > Diversity indices and then opting to calculate the species richness with the calculation method for all sites.
- Rather than naming sample units as S1, S10 or S100, use a numbering system with leading zeroes such as S001, S010 and S100.
- In case names of sites are not in the same sequence or do not contain the same subset of sample units, use the menu option of: BiodiversityR > Community matrix > Same sites for community/environmental.
- In some situations, MS Excel imports data from a larger number of columns or rows than the current data range (this seems to result from the earlier presence of data in those columns or rows, even if the data were deleted later). You may therefore wish to open a new workbook and copy the desired data ranges into the community, environmental and stacked worksheets.
- One method to check what could have gone wrong when trying to import data is to import data via the Rcmdr option of: Data > Import Data > From Excel, Access or dBase data set.
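Some of the naming suggestions above can be checked in R before or after importing. The sketch below uses base R only; the species and site labels are invented for this illustration, and the object names are not part of BiodiversityR.

```r
# Hypothetical species names and site numbers, invented for this illustration.
species <- c("Olea_capensis", "olea_capensis", "Prunus africana",
             "Croton_megalocarpus")
site.numbers <- c(1L, 10L, 100L)

# Names that differ only in capitalisation would become separate species in R.
lower <- tolower(species)
case.clashes <- species[lower %in% lower[duplicated(lower)]]
case.clashes

# Names that contain spaces, and a version with spaces replaced by underscores.
with.spaces <- species[grepl(" ", species, fixed = TRUE)]
cleaned <- gsub(" ", "_", species, fixed = TRUE)

# Site labels with leading zeroes sort in the expected order.
site.labels <- sprintf("S%03d", site.numbers)
site.labels
```

Here case.clashes lists both spellings of Olea_capensis, so that one of them can be corrected in the worksheet before the data are imported again.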

Figure 4. Required format of the Excel workbook with the environmental data set: the name of the worksheet is "environmental" (without capitals), row 1 gives the names of the various variables (preferably without spaces or special characters), column A contains names of sample units, other columns (B-F) document characteristics of sample units. The name of the variable with the labels for sample units (cell A1) should be the same as the name for the variables with labels in sheet "community" (Figure 5) or "stacked" (Figure 6).

Figure 5. Required format of the Excel workbook with the community data set: the name of the worksheet is "community" (without capitals), names for variables (preferably without spaces or special characters) are given in row 1, column A contains labels of sample units, whereas other columns document abundances of species. The name of the variable with the labels for sample units is given in cell A1. Except for the names of sample units, this data set only contains continuous (numeric) variables.

Figure 6. Required format of the Excel workbook with the stacked data set: the name of the worksheet is "stacked" (without capitals), names for variables (preferably without spaces or special characters) are given in row 1, one variable contains labels of sample units (column A), a second variable contains names of species (preferably without spaces or special characters, column B) and a third variable documents abundances of species (column C).
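To illustrate how the stacked format relates to the community matrix, the sketch below converts a small stacked table into a sites × species matrix with base R. The data and the column names (sites, species, abundance) are invented for this illustration; this is not the code that the import functions of BiodiversityR use internally.

```r
# Hypothetical stacked data, invented for this illustration.
stacked <- data.frame(
  sites = c("S001", "S001", "S002", "S003", "S003"),
  species = c("Olea_capensis", "Prunus_africana", "Olea_capensis",
              "Croton_megalocarpus", "Olea_capensis"),
  abundance = c(3, 1, 2, 5, 4),
  stringsAsFactors = FALSE
)

# Sum abundances for every combination of site and species;
# combinations that do not occur in the stacked data become 0.
community <- xtabs(abundance ~ sites + species, data = stacked)

# Convert the cross-tabulation to a data frame with sites as rows
# and species as columns.
community <- as.data.frame.matrix(community)
community

# The number of columns equals the number of species.
ncol(community)
```

Checking ncol(community) against the expected number of species is a quick way to detect species names that were split by inconsistent capitalisation or spelling.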

3. Main changes in the examples of the manual


The main change in the examples is that the menu options should now be accessed via BiodiversityR and not Biodiversity. Make sure that you select a community data set and an environmental data set (see above and Chapter 2 of the manual) before embarking on analysis.

Most of the menu options and commands remain the same or changed only slightly. For example, the option for calculating the first-order jackknife gamma diversity estimator is now "jack1", whereas it was documented as "Jack.1" in the manual; this reflects a change that was made in the vegan community ecology package that is used to calculate the result. In case it is not clear which option to choose, please check the changes in the commands. Remember that menu options result in calculations when the "OK" button is clicked. Menu options related to graphical output are invoked by the "Plot" button.

I suggest checking the commands that are listed below rather than the commands listed in the guide. Commands should be pasted into the "Script Window" of the R Commander and highlighted, and the "Submit" button should then be clicked to obtain results. The commands are also available as scripts that are listed in a separate directory (Manual \ Scripts). These scripts can be accessed via the menu option of: File > Open script from the R Console or the menu option of: File > Open script file from the R Commander.

We encourage users to explore importing data into R. However, all data sets can also be imported by loading the workspace of TreeDiversity.RData (available from the Data directory) via the menu option of: File > Load Workspace from the R Console.

Commands for Chapter 1: Sampling


#To load polygons with the research areas:
area <- array(c(10,10,15,35,40,35,5,35,35,30,30,10), dim=c(6,2))
landuse1 <- array(c(10,10,15,15,30,35,35,30), dim=c(4,2))
landuse2 <- array(c(10,10,15,15,35,30,10,30,30,35,30,15), dim=c(6,2))
landuse3 <- array(c(10,10,30,35,40,35,5,10,15,30,30,10), dim=c(6,2))
window <- array(c(15,15,30,30,10,25,25,10), dim=c(4,2))

#To plot the research area:
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)

#To randomly select sample plots in a window:
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)
spatialsample(window, method="random", n=20, xwidth=1, ywidth=1, plotit=T, plothull=T)

#To randomly select sample plots in the survey area:
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)
spatialsample(area, method="random", n=20, xwidth=1, ywidth=1, plotit=T, plothull=F)

#To select sample plots on a grid:
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)
spatialsample(area, method="grid", xwidth=1, ywidth=1, plotit=T, xleft=10.5, ylower=5.5, xdist=1, ydist=1)

#To select sample plots on a grid (alternative):
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)
spatialsample(area, method="grid", xwidth=1, ywidth=1, plotit=T, xleft=12, ylower=7, xdist=4, ydist=4)

#To randomly select sample plots from a grid:
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)
spatialsample(area, method="random grid", n=20, xwidth=1, ywidth=1, plotit=T, xleft=10.5, ylower=5.5, xdist=1, ydist=1)

#To randomly select sample plots from a grid (alternative):
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)
spatialsample(area, method="random grid", n=20, xwidth=1, ywidth=1, plotit=T, xleft=12, ylower=7, xdist=4, ydist=4)

#To select sample plots from a grid with random start:
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)
spatialsample(area, method="random grid", n=20, xwidth=1, ywidth=1, plotit=T, xdist=4, ydist=4)

#To randomly select maximum 10 sample plots from each type of landuse:
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)
spatialsample(landuse1, n=10, method="random", plotit=T)
spatialsample(landuse2, n=10, method="random", plotit=T)
spatialsample(landuse3, n=10, method="random", plotit=T)

#To randomly select sample plots from a grid within each type of landuse.
#Within each landuse, the grid has a random starting position:
plot(area[,1], area[,2], type="n", xlab="horizontal position", ylab="vertical position", lwd=2, bty="l")
polygon(landuse1)
polygon(landuse2)
polygon(landuse3)
spatialsample(landuse1, n=10, method="random grid", xdist=2, ydist=2, plotit=T)
spatialsample(landuse2, n=10, method="random grid", xdist=4, ydist=4, plotit=T)
spatialsample(landuse3, n=10, method="random grid", xdist=4, ydist=4, plotit=T)

#To calculate sample size requirements:
power.t.test(n=NULL, delta=1, sd=1, sig.level=0.05, power=0.8, type="two.sample")
power.t.test(n=NULL, delta=0.5, sd=1, sig.level=0.05, power=0.8, type="two.sample")
power.anova.test(n=NULL, groups=4, between.var=1, within.var=1, power=0.8)
power.anova.test(n=NULL, groups=4, between.var=2, within.var=1, power=0.8)

#To calculate the area of a polygon:
areapl(landuse1)

Commands for Chapter 2: Data preparation


#To load data from an external file:
new.data <- read.table(file="D://my files/data.txt")
new.data <- read.table(file.choose())

#To save data to an external file:
write.table(new.data, file="D://my files/data.txt")
write.table(new.data, file.choose())

#To summarize the data and check for exceptional cases:
summary(dune.env)
boxplot(dune.env$A1)
points(mean(dune.env$A1),cex=2.5,pch=3)
table(dune.env$Management)
plot(dune.env$Management)
pairs(dune.env)

#To transform the data:
dune.ln.transformed <- log(dune+1)
dune.ln.transformed
dune.squareroot.transformed <- dune^0.5
dune.squareroot.transformed
dune.speciesprofile <- decostand(dune,"total")
dune.speciesprofile
dune.env$A1.standard <- scale(dune.env$A1)
summary(dune.env$A1.standard)

#Checking whether data is normally distributed:
qq.plot(dune.env$A1)
shapiro.test(dune.env$A1)
ks.test(dune.env$A1,pnorm)

Commands for Chapter 4: Analysis of species richness


#To calculate the total number of species:
Diversity.1 <- diversityresult(dune, index="richness")
Diversity.1

#To calculate the total species richness for separate sites:
Diversity.2 <- diversityresult(dune, index="richness",method="s")
Diversity.2
summary(Diversity.2)
Diversity.3 <- diversityresult(dune[1:2,], index="richness")
Diversity.3

#To compare the total number of species for various subsets of data:
Diversity.4 <- diversitycomp(dune, y=dune.env,factor1="Management", index="richness",method="all")
Diversity.4

#To calculate a sample-based species accumulation curve
Accum.1 <- accumresult(dune, method="exact")
Accum.1
accumplot(Accum.1)
Accum.2 <- accumresult(dune, method="random",permutations=1000)
Accum.2
accumplot(Accum.2,addit=T,col=2)

#To calculate a sample-based species accumulation curve scaled by the number of accumulated individual plants
dune.env$site.totals <- apply(dune,1,sum)
Accum.3 <- accumresult(dune, y=dune.env, scale="site.totals",method="exact")
Accum.3
accumplot(Accum.3, xlab="pooled individuals")

# To calculate an individual-based species accumulation curve (first scaled by the number of
# sites in Accum.4, then by the number of accumulated plants in Accum.5)
Accum.4 <- accumresult(dune, method="rarefaction")
Accum.4
accumplot(Accum.4)
dune.env$site.totals <- apply(dune,1,sum)
Accum.5 <- accumresult(dune, y=dune.env,scale="site.totals", method="rarefaction")
Accum.5
accumplot(Accum.5, xlab="pooled individuals")

#To compare species richness between various subsets in the data using species accumulation curves
Accum.6 <- accumcomp(dune, y=dune.env, factor="Management",method="exact")
# click in the graph to show where the legend should be placed
Accum.6
dune.env$site.totals <- apply(dune,1,sum)
Accum.7 <- accumcomp(dune, y=dune.env, factor="Management",scale="site.totals", method="exact", xlab="pooled individuals")
# click in the graph to show where the legend should be placed
Accum.7

#To calculate a collectors curve
Accum.8 <- accumresult(dune, method="collector")
Accum.8
accumplot(Accum.8)

#Calculating the expected species richness for the entire survey area
Diversity.5 <- diversityresult(dune, index="jack1")
Diversity.5
Diversity.6 <- diversityresult(dune, index="jack2")
Diversity.6
Diversity.7 <- diversityresult(dune, index="chao")
Diversity.7
Diversity.8 <- diversityresult(dune, index="boot")
Diversity.8
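To make the "jack1" option more concrete, the sketch below calculates a first-order jackknife estimate by hand with base R. It assumes the usual incidence-based formula, observed richness + f1 × (N - 1) / N, where f1 is the number of species found in exactly one site and N is the number of sites. The matrix toy is invented for this illustration and is not one of the data sets of the manual.

```r
# Hypothetical community matrix (4 sites, 5 species), invented for this illustration.
toy <- matrix(c(3, 0, 1, 0, 0,
                2, 1, 0, 0, 0,
                0, 4, 0, 0, 1,
                1, 0, 0, 2, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("S00", 1:4), paste0("sp", 1:5)))

# Number of sites in which each species occurs.
frequency <- colSums(toy > 0)

n.sites <- nrow(toy)
observed <- sum(frequency > 0)   # observed species richness
uniques <- sum(frequency == 1)   # species found in exactly one site

# First-order jackknife estimate of the total species richness.
jack1 <- observed + uniques * (n.sites - 1) / n.sites
jack1
```

In this toy matrix, three of the five species occur in a single site, so the estimate exceeds the observed richness.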

Commands for Chapter 5: Analysis of diversity


#To calculate and plot a rank-abundance curve:
RankAbun.1 <- rankabundance(dune)
RankAbun.1
rankabunplot(RankAbun.1, scale="abundance",specnames=c(1:3))
rankabunplot(RankAbun.1, scale="proportion",specnames=c(1:3))

#To model a rank-abundance curve:
radfitresult(dune)

#To calculate and plot a Rényi diversity profile:
Renyi.1 <- renyiresult(dune)
Renyi.1
renyiplot(Renyi.1, legend=FALSE)
renyiplot(Renyi.1, evenness=TRUE, legend=FALSE)

#To calculate and plot a Rényi diversity profile for each site separately:
Renyi.2 <- renyiresult(dune, method="s")
Renyi.2
renyiplot(Renyi.2, legend=FALSE)
renyiplot(Renyi.2, evenness=TRUE, legend=FALSE)

#To calculate diversity indices for each site:
Diversity.1 <- diversityresult(dune, index="Shannon", method="s")
Diversity.1
Diversity.2 <- diversityresult(dune, index="Simpson", method="s")
Diversity.2
Diversity.3 <- diversityresult(dune, index="Logalpha", method="s")
Diversity.3
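The sketch below shows what two of these indices measure, using base R on invented data. It assumes the Shannon index with natural logarithms and the Simpson index in the form 1 - sum(p^2), which is the convention of the diversity function in the vegan package; the helper functions shannon and simpson are written for this illustration only.

```r
# Hypothetical abundances for two sites, invented for this illustration.
toy <- rbind(even   = c(10, 10, 10, 10),
             uneven = c(37, 1, 1, 1))

# Shannon index: -sum(p * ln(p)) over the species that are present.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# Simpson index in the form 1 - sum(p^2).
simpson <- function(x) {
  p <- x / sum(x)
  1 - sum(p^2)
}

H <- apply(toy, 1, shannon)
D <- apply(toy, 1, simpson)
H
D
```

Both sites have four species, but the site with evenly distributed abundances obtains the higher value for both indices.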

#To compare diversity between subsets of the dataset:
Renyi.3 <- renyicomp(dune, y=dune.env, factor="Management", permutations=100)
# Click in the graph to show where the legend should be placed
Renyi.3

#To calculate accumulation patterns for the Rényi diversity profile
Renyi.4 <- renyiaccum(dune, permutations=100)
Renyi.4
persp(Renyi.4)
rgl.renyiaccum(Renyi.4)

Commands for Chapter 6: Analysis of counts of trees


#Load the dataset Faramea.txt and give it the name faramea.
faramea <- read.table(file.choose())
attach(faramea)

#To calculate a linear regression model:
Count.model1 <- lm(Faramea.occidentalis ~ Precipitation, data=faramea, na.action=na.exclude)
summary(Count.model1)
fitted(Count.model1)
predict(Count.model1, interval="confidence")
residuals(Count.model1)
shapiro.test(residuals(Count.model1))
ks.test(residuals(Count.model1), pnorm)
anova(Count.model1,test="F")
Count.model2 <- lm(Faramea.occidentalis ~ Age.cat, data=faramea, na.action=na.omit)
levene.test(residuals(Count.model2), na.omit(faramea)$Age.cat)

#To plot a linear regression model:
plot(Count.model1)
termplot(Count.model1, se=T, partial.resid=T, rug=T,terms="Precipitation")
library(effects)
as.data.frame(effect('Precipitation',Count.model1))
plot(effect("Precipitation", Count.model1))

#To check for the spatial distribution of residuals:
surface.1 <- residualssurface(Count.model1, na.omit(faramea),"UTM.EW", "UTM.NS", gam=F, npol=1, plotit=T, bubble=F, fill=F)
surface.2 <- residualssurface(Count.model1, na.omit(faramea),"UTM.EW", "UTM.NS", gam=F, npol=2, plotit=T, bubble=F, fill=F)
surface.gam <- residualssurface(Count.model1, na.omit(faramea), "UTM.EW", "UTM.NS", gam=T, npol=2, plotit=T, bubble=F, fill=T)
summary(surface.1)
anova(surface.1)
correlogram(surface.1, nint=10)
summary(surface.gam)

#To calculate a generalized linear regression model (GLM):
Count.model3 <- glm(formula = Faramea.occidentalis ~ Precipitation, family = poisson(),data=faramea, na.action=na.exclude)
summary(Count.model3)
anova(Count.model3,test="F")
predict(Count.model3, type="response", se.fit=T)
Count.model4 <- glm(formula = Faramea.occidentalis ~ Precipitation, family = quasipoisson(), data=faramea, na.action=na.exclude)
summary(Count.model4)
anova(Count.model4,test="F")
Count.model5 <- glm.nb(Faramea.occidentalis ~ Precipitation, maxit = 5000, init.theta = 1, data=faramea, na.action=na.exclude)
summary(Count.model5)
anova(Count.model5,test="F")

#To calculate a generalized additive regression model (GAM):
Count.model6 <- gam(Faramea.occidentalis ~ s(Precipitation),family=poisson(), data = na.omit(faramea))
summary(Count.model6)
predict(Count.model6, type="response", se.fit=T)

#To calculate a multiple regression model:
Count.model7 <- glm.nb(Faramea.occidentalis ~ Precipitation + I(Precipitation^2), maxit = 5000, init.theta = 1, data=faramea, na.action=na.exclude)
summary(Count.model7)
anova(Count.model7, test="F")
Anova(Count.model7, type="II", test="Wald")
vif(lm(Faramea.occidentalis ~ Precipitation + I(Precipitation^2), data=faramea, na.action=na.exclude))

Commands for Chapter 7: Analysis of presence or absence of species


faramea <- read.table(file.choose())
attach(faramea)

#To analyse presence or absence by cross-tabs:
table1 <- table(Faramea.occidentalis>0, Age.cat)
Presabs.1 <- chisq.test(table1)
Presabs.1
Presabs.1$observed
Presabs.1$expected

#To calculate a generalized linear regression model (GLM):
Presabs.model2 <- glm(formula = Faramea.occidentalis>0 ~ Age.cat, family = binomial(link=logit), data = faramea, na.action = na.exclude)
summary(Presabs.model2)
anova(Presabs.model2,test="F")
predict(Presabs.model2, type="response", se.fit=T)
null.model <- glm(formula = Faramea.occidentalis>0 ~ 1, family = binomial(link=logit), data = faramea, na.action = na.exclude)
anova(null.model, Presabs.model2, test="Chi")
plot(Presabs.model2)
termplot(Presabs.model2, se=T, partial.resid=T, rug=T, terms="Age.cat")
library(effects)
plot(effect("Age.cat", Presabs.model2))
Presabs.model3 <- glm(formula = Faramea.occidentalis>0 ~ Age.cat, family = quasibinomial(link=logit), data = faramea, na.action = na.exclude)
summary(Presabs.model3)
Presabs.model4 <- glm(formula = Faramea.occidentalis>0 ~ Elevation, family = quasibinomial(link=logit), data = faramea, na.action = na.exclude)
summary(Presabs.model4)

#To calculate a generalized additive regression model (GAM):
Presabs.model5 <- gam(formula = Faramea.occidentalis>0 ~ s(Precipitation) + Geology + Age.cat + s(Elevation), family = quasibinomial(link=logit), data = faramea, na.action = na.exclude)
summary(Presabs.model5)

#To calculate a GLM with several explanatory variables:
Presabs.model6 <- glm(formula = Faramea.occidentalis > 0 ~ Precipitation + I(Precipitation^2) + Geology + Age.cat + Elevation + I(Elevation^2), family = binomial(link = logit), data = faramea, na.action = na.exclude)
summary(Presabs.model6)
anova(Presabs.model6,test="Chi")
drop1(Presabs.model6, test="Chi")

Commands for Chapter 8: Analysis of differences in species composition


#Calculating distance matrices
euclidean.distance <- vegdist(dune,method="euclidean")
euclidean.distance
bray.distance <- vegdist(dune,method="bray")
bray.distance
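For two sites, both distances can be verified by hand. The sketch below uses base R on invented abundances; it assumes the standard definitions of the Euclidean distance and of the Bray-Curtis distance (sum of absolute differences divided by the sum of all abundances of both sites).

```r
# Hypothetical abundances of three species in two sites, invented for this illustration.
site1 <- c(1, 2, 3)
site2 <- c(3, 2, 1)

# Euclidean distance: square root of the summed squared differences.
euclidean <- sqrt(sum((site1 - site2)^2))

# Bray-Curtis distance: summed absolute differences divided by the summed abundances.
bray <- sum(abs(site1 - site2)) / sum(site1 + site2)

euclidean
bray

# The base R function dist() gives the same Euclidean distance.
as.numeric(dist(rbind(site1, site2), method = "euclidean"))
```

The Bray-Curtis distance is bounded between 0 (identical composition) and 1 (no shared species), whereas the Euclidean distance has no upper limit.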

#Transformations of the species data
community.hel <- disttransform(dune, method="hellinger")
hellinger.distance <- vegdist(community.hel,method="euclidean")
hellinger.distance
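The Hellinger transformation itself is simple to reproduce. The sketch below uses base R on an invented matrix and assumes the usual definition (square root of the proportional abundance within each site); it is meant as an illustration of the idea, not as the code used by disttransform.

```r
# Hypothetical community matrix (2 sites, 3 species), invented for this illustration.
toy <- rbind(S001 = c(1, 3, 0),
             S002 = c(4, 4, 8))

# Divide each abundance by its site total, then take the square root.
toy.hel <- sqrt(toy / rowSums(toy))
toy.hel

# After the transformation, the squared values of each site sum to 1.
rowSums(toy.hel^2)

# The Euclidean distance between transformed sites is the Hellinger distance.
dist(toy.hel, method = "euclidean")
```

Because each site is first expressed in proportions, differences in total abundance between sites no longer dominate the distance.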

#Calculating the rank-correlation with the mantel test
envir.distance <- vegdist(dune.env$A1, method="euclidean")
ecology.distance <- vegdist(dune, method="kul")
mantel(envir.distance, ecology.distance, "kendall")
plot(envir.distance, ecology.distance)

#Calculating an ANOSIM test
ecology.distance <- vegdist(dune, method="kul")
anosim(ecology.distance, dune.env$Management)

Commands for Chapter 9: Analysis of ecological distance by clustering


#Calculate and plot agglomerative clustering:
library(cluster)
distmatrix <- vegdist(dune, method="bray")
distmatrix
Cluster.1 <- agnes(distmatrix, method="single")
summary(Cluster.1)
plot(Cluster.1, which.plots=2,main="",sub="",xlab="",ylab="")
plot(Cluster.1, which.plots=2, hang=-1,main="",sub="",xlab="",ylab="")
Cluster.2 <- agnes(distmatrix, method="single")
summary(Cluster.2)
Cluster.3 <- agnes(distmatrix, method="complete")
summary(Cluster.3)

#Calculate and plot divisive clustering:
distmatrix <- vegdist(dune, method="bray")
Cluster.4 <- diana(distmatrix)
summary(Cluster.4)
plot(Cluster.4, which.plots=2,main="",sub="",xlab="",ylab="")
plot(Cluster.4, which.plots=2, hang=-1,main="",sub="",xlab="",ylab="")

#Calculating cophenetic correlation (for hierarchical clusters):
copheneticdist <- cophenetic(Cluster.1)
copheneticdist
mantel(distmatrix,copheneticdist,permutations=1000)
plot(distmatrix, copheneticdist)
abline(0,1)

#Selecting cluster membership from a hierarchical clustering:
cutree(Cluster.1,k=4)
plot(Cluster.1, which.plots=2, hang=-1,main="",sub="",xlab="",ylab="")
rect.hclust(Cluster.1, k=4, border="blue")
rect.hclust(Cluster.1, k=4)

#Calculating non-hierarchical clusters:
distmatrix <- vegdist(dune,method="bray")
Cluster.5 <- kmeans(dune, centers=5, iter.max=100)
Cluster.5
Cluster.6 <- pam(distmatrix, k=5)
summary(Cluster.6)
Cluster.7 <- clara(dune, k=5)
summary(Cluster.7)
Cluster.8 <- fanny(distmatrix, k=5)
summary(Cluster.8)

Commands for Chapter 10: Analysis of ecological distance by ordination


#Calculating a principal component analysis (PCA)
Ordination.model1 <- rda(dune)
summary(Ordination.model1, scaling=1)
plot1 <- ordiplot(Ordination.model1, scaling=1, type="text")
plot2 <- ordiplot(Ordination.model1, scaling=2, type="text")

#Calculating the variance of each species of the species matrix
inertcomp(Ordination.model1, display="species",statistic="explained", proportional=F)

#Calculating the proportion of variance explained for an ordination graph
goodness(Ordination.model1, display="sites", choices=c(1:2),statistic="explained")

#Adding a vector and perpendicular lines for a particular species to an ordination plot
ordivector(plot1,"Agrsto",lty=2)

#Calculating correlations among vectors:
cor.test(dune[,"Alogen"],dune[,"Agrsto"])

#Calculating the number of ecologically meaningful principal components:
PCAsignificance(Ordination.model1,axes=30)
screeplot.cca(Ordination.model1,bstick=T)

#Drawing an equilibrium circle
plot1 <- ordiplot(Ordination.model1, scaling=1, type="text")
ordiequilibriumcircle(Ordination.model1,plot1)

#Calculating a PCA on a transformed matrix
Community.1 <- disttransform(dune, method="hellinger")
Ordination.model2 <- rda(Community.1)
summary(Ordination.model2, scaling=1)
plot3 <- ordiplot(Ordination.model2, scaling=1, type="text")

#Calculating a principal coordinates analysis (PCoA)
distmatrix <- vegdist(dune,method="bray")
Ordination.model3 <- cmdscale(distmatrix, k=nrow(dune)-1,eig=T, add=F)
plot4 <- ordiplot(Ordination.model3, type="text")
abline(h = 0, lty = 3)
abline(v = 0, lty = 3)
Ordination.model3 <- add.spec.scores(Ordination.model3, dune,method="pcoa.scores", Rscale=F, scaling=1, multi=0.1)
plot4 <- ordiplot(Ordination.model3, type="text")
abline(h = 0, lty = 3)
abline(v = 0, lty = 3)

#Calculating a non-metric multidimensional scaling (NMS)
distmatrix <- vegdist(dune, method="bray")
initNMS <- NMSrandom(distmatrix, perm=100, k=2)
Ordination.model4 <- postMDS(initNMS, distmatrix)
Ordination.model4 <- add.spec.scores(Ordination.model4, dune, method="wa.scores")
Ordination.model4
plot5 <- ordiplot(Ordination.model4, type="text")
abline(h = 0, lty = 3)
abline(v = 0, lty = 3)

#Calculating a correspondence analysis (CA or WA)
Ordination.model5 <- cca(dune)
summary(Ordination.model5, scaling=1)
plot6 <- ordiplot(Ordination.model5, type="text", scaling=1)

#Calculating a redundancy analysis (RDA)
Ordination.model6 <- rda(dune ~ Management, dune.env)
summary(Ordination.model6, scaling=1)
permutest.cca(Ordination.model6, permutations=1000)
plot7 <- ordiplot(Ordination.model6, type="text", scaling=1)

#Calculating a canonical correspondence analysis (CCA)
Ordination.model7 <- cca(dune ~ Management, dune.env)
summary(Ordination.model7, scaling=2)
permutest.cca(Ordination.model7, permutations=1000)
plot8 <- ordiplot(Ordination.model7, type="text", scaling=1)

#Calculating distance-based redundancy analysis (db-RDA)
Ordination.model8 <- capscale(dune ~ Management, dune.env)
summary(Ordination.model8, scaling=1)
permutest.cca(Ordination.model8, permutations=1000)
plot9 <- ordiplot(Ordination.model8, type="text", scaling=1)

#Calculating the correlation between distance in an ordination graph and total distance
distmatrix <- vegdist(dune,method="bray")
Ordination.model3 <- cmdscale(distmatrix, k=nrow(dune)-1,eig=T, add=F)
plot4 <- ordiplot(Ordination.model3, type="text")
abline(h = 0, lty = 3)
abline(v = 0, lty = 3)
distdisplayed(dune, plot4, distx="bray")

#Plotting clustering results onto an ordination graph
distmatrix <- vegdist(dune, method="bray")
Ordination.model3 <- cmdscale(distmatrix, k=nrow(dune)-1,eig=T, add=F)
plot4 <- ordiplot(Ordination.model3, type="text")
cluster <- hclust(distmatrix, method="single")
ordicluster(plot4, cluster,col="green")

#Plotting quantitative environmental variables onto an ordination graph
distmatrix <- vegdist(dune, method="bray")
Ordination.model3 <- cmdscale(distmatrix, k=nrow(dune)-1,eig=T, add=F)
plot4 <- ordiplot(Ordination.model3, type="text")
abline(h = 0, lty = 3)
abline(v = 0, lty = 3)
attach(dune.env)
fitted <- envfit(plot4, data.frame(A1), permutations=100)
fitted
plot(fitted)
ordibubble(plot4, A1)
ordisurf(plot4, A1)

#Plotting categorical environmental variables onto an ordination graph
distmatrix <- vegdist(dune, method="bray")
Ordination.model3 <- cmdscale(distmatrix, k=nrow(dune)-1,eig=T, add=F)
plot4 <- ordiplot(Ordination.model3, type="n")
abline(h = 0, lty = 3)
abline(v = 0, lty = 3)
ordisymbol(plot4, dune.env, "Management", legend=T)
# Click in the figure where the legend should be placed
attach(dune.env)
fitted2 <- envfit(plot4, data.frame(Management), permutations=100)
fitted2
plot(fitted2)
plot4 <- ordiplot(Ordination.model3, type="p")
abline(h = 0, lty = 3)
abline(v = 0, lty = 3)
ordihull(plot4, Management)
plot4 <- ordiplot(Ordination.model3, type="p")
abline(h = 0, lty = 3)
abline(v = 0, lty = 3)
ordispider(plot4, Management)
plot4 <- ordiplot(Ordination.model3, type="p")
abline(h = 0, lty = 3)
abline(v = 0, lty = 3)
ordiellipse(plot4, Management)
