
DNA barcoding and metabarcoding of standardized samples reveal patterns of marine benthic diversity


Matthieu Leray and Nancy Knowlton¹
National Museum of Natural History, Smithsonian Institution, Washington, DC 20013

Contributed by Nancy Knowlton, December 31, 2014 (sent for review October 29, 2014; reviewed by Naiara Rodriguez-Ezpeleta and Robert Toonen)

Documenting the diversity of marine life is challenging because many species are cryptic, small, and rare, and belong to poorly known groups. New sequencing technologies, especially when combined with standardized sampling, promise to make comprehensive biodiversity assessments and monitoring feasible on a large scale. We used this approach to characterize patterns of diversity on oyster reefs across a range of geographic scales comprising a temperate location [Virginia (VA)] and a subtropical location [Florida (FL)]. Eukaryotic organisms that colonized multilayered settlement surfaces (autonomous reef monitoring structures) over a 6-mo period were identified by cytochrome c oxidase subunit I barcoding (>2-mm mobile organisms) and metabarcoding (sessile and smaller mobile organisms). In a total area of ∼15.64 m² and volume of ∼0.09 m³, 2,179 operational taxonomic units (OTUs) were recorded from 983,056 sequences. However, only 10.9% could be matched to reference barcodes in public databases, with only 8.2% matching barcodes with both genus and species names. Taxonomic coverage was broad, particularly for animals (22 phyla recorded), but 35.6% of OTUs detected via metabarcoding could not be confidently assigned to a taxonomic group. The smallest size fraction (500 to 106 μm) was the most diverse (more than two-thirds of OTUs). There was little taxonomic overlap between VA and FL, and samples separated by ∼2 m were significantly more similar than samples separated by ∼100 m. Ground-truthing with independent assessments of taxonomic composition indicated that both presence–absence information and relative abundance information are captured by metabarcoding data, suggesting considerable potential for ecological studies and environmental monitoring.

oyster reefs | operational taxonomic units | meiofauna | ARMS | cryptic species

Significance

High-throughput DNA sequencing methods are revolutionizing our ability to census communities, but most analyses have focused on microbes. Using an environmental DNA sequencing approach based on cytochrome c oxidase subunit 1 primers, we document the enormous diversity and fine-scale geographic structuring of the cryptic animals living on oyster reefs, many of which are rare and very small. Sequence data reflected both the presence and relative abundance of organisms, but only 10.9% of the sequences could be matched to reference barcodes in public databases. These results highlight the enormous numbers of marine animal species that remain genetically unanchored to conventional taxonomy and the importance of standardized, genetically based biodiversity surveys to monitor global change.

Author contributions: M.L. and N.K. designed research; M.L. performed research; M.L. contributed new reagents/analytic tools; M.L. analyzed data; and M.L. and N.K. wrote the paper.
Reviewers: N.R.-E., AZTI-Tecnalia; and R.T., Hawaii Institute of Marine Biology.
The authors declare no conflict of interest.
Freely available online through the PNAS open access option.
Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. KP253982–KP255345), the Barcode Of Life Data Systems (doi: dx.doi.org/10.5883/DS-ARMS), and the Dryad Digital Repository (doi: doi.org/10.5061/dryad.d0r79).
¹To whom correspondence should be addressed. Email: knowlton@si.edu.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1424997112/-/DCSupplemental.

Understanding the diversity of life in the sea continues to challenge marine scientists because samples typically contain many rare species, most of them small and difficult to identify (1). Moreover, recent estimates suggest that between 33% and 91% of all marine species have never been named (2, 3). These constraints have limited our ability to investigate patterns of diversity beyond a few indicator groups (4), most often conspicuous macroinvertebrates and fish. For this reason, molecular methods, particularly high-throughput sequencing (HTS) approaches, hold considerable promise not only for fundamental understanding of diversity but also for biodiversity monitoring in the context of global change (5).

Molecular methods are particularly powerful when combined with standardized sampling, allowing for direct comparisons across space and through time. In the ocean, analyzing standard volumes of readily sampled material (e.g., seawater, sediments) has a long tradition, and, increasingly, HTS approaches are being applied to these samples (6). Complex hard substrates provide greater challenges for consistent sampling, which can be met either by collecting approximately standard volumes (e.g., of rubble) or by deploying settlement structures (e.g., ref. 7).

Here, we combine standardized sampling with molecular diversity assessments for samples from oyster reefs from one temperate location and one subtropical location on the US Atlantic Coast. In addition to their commercial value and their role in maintaining water quality, oyster beds shelter considerable diversity because of their 3D complexity, essentially the nontropical equivalent of coral reefs. They are also, like coral reefs, highly threatened, with up to 85% having been lost due to anthropogenic impacts (8).

We report analyses of a nested set of autonomous reef monitoring structures (ARMS), which provide surfaces and spaces for mobile and sessile organisms to settle on or shelter within (SI Text, section I and Fig. S1). ARMS were deployed for about 6 mo on the ocean side of the Eastern Shore of Virginia (VA) and in the Indian River Lagoon in Florida (FL). At each location, there were three replicates ∼2 m apart at each of three sites ∼100 m apart (total of 18 ARMS; Fig. S1A). Four fractions were analyzed separately: sessile organisms growing on the plates and three fractions of organisms retained by 2-mm, 500-μm, and 106-μm sieves. We sequenced the cytochrome c oxidase subunit I (COI) gene for each specimen of the >2-mm animals (barcoding). The remaining fractions were homogenized, and COI amplicons were analyzed from bulk samples using HTS (metabarcoding). Sequences were clustered in operational taxonomic units (OTUs) and identified to the lowest possible taxonomic level using nucleotide BLAST (BLASTn) searches against public databases or by phylogenetic assignment when no direct match could be found. The effectiveness of the metabarcoding approach was assessed for the sessile and 2-mm to 500-μm fractions by comparing numbers of sequences with point counts and estimates of total

2076–2081 | PNAS | February 17, 2015 | vol. 112 | no. 7 www.pnas.org/cgi/doi/10.1073/pnas.1424997112


DNA per OTU, respectively. Noneukaryotic sequences were not analyzed.

Results

Patterns of Diversity and Abundance. Organisms >2 mm had relatively low diversity, with FL samples having about 1.7-fold more OTUs in total than VA samples, and also more phyla represented. Barcoding revealed a total of 38 OTUs in VA (498 individuals) and 64 OTUs in FL (655 individuals). In both locations, the most abundant and species-rich higher taxon was the Arthropoda (16 OTUs in 385 individuals in VA and 30 OTUs in 583 individuals in FL; Fig. 1). Samples from both locations additionally contained representatives of the Annelida, Chordata, and Mollusca; FL samples also contained Platyhelminthes and Echinodermata. Over half of the sequences at both locations matched reference sequences (>97% similarity) in the GenBank sequence database (GenBank) or Barcode of Life Data Systems (BOLD) (Table 1 and Table S1).

In contrast, the sessile and two smaller sieved fractions had much higher diversity at the OTU and phylum levels, and a much lower proportion of OTUs matching sequences in public databases. In total, HTS detected 1,204 OTUs from 572,290 sequences and 1,391 OTUs from 409,613 sequences from VA and FL, respectively (Table 1). Matches to GenBank or BOLD sequences were low (<12%) and comparable at both locations. Several additional OTUs could be identified because they matched reference barcodes obtained from the >2-mm and 2-mm to 500-μm fractions that were characterized morphologically (3.2% in VA and 3.9% in FL). Many more of the remaining OTUs could be assigned to a higher taxonomic group using the Bayesian phylogenetic approach (45.7% in VA and 55.9% in FL). However, 40.9% (VA) and 28.3% (FL) of OTUs could not be confidently assigned to any taxonomic group using this approach. Taxonomic coverage was very broad, with 22 phyla of animals, five major groups of protists, two major groups of Fungi, and three major groups of Plantae. Among the identified OTUs, Arthropoda were again the most OTU-rich (21.5% in VA and 29% in FL; Fig. 1). In FL, Arthropoda also had the highest percentage of sequences (28.5%), but in VA, the largest number of sequences belonged to Cnidaria (38.7%).

As is common in many samples of diversity, there were a few common taxa and many rare taxa in all samples. The percentage of singletons was surprisingly uniform, ranging from 31.6–46.9% in the barcoded samples and from 30.3–39.8% in the metabarcoded samples (Table 1). The groups with the highest percentage of singletons were “unidentified” OTUs in both VA and FL (53.2% and 36% of singletons, respectively), followed by Arthropoda (14.6% and 23.3% of singletons, respectively). The most common OTUs, as measured by either the number of sequences (Fig. 2) or the number of samples where they occurred (Fig. S2), were more likely to match reference barcodes (>97% similarity).

Overall, total diversity was surprisingly high for such small samples, and statistical analysis revealed that sampling was far from exhaustive. The mean (±SD) number of taxa per ARMS was 439.3 ± 54.9 for VA (11.2 ± 4.1 barcoded and 434.2 ± 55.7 metabarcoded) and 545.8 ± 29.6 for FL (15.8 ± 4.9 barcoded and 536.7 ± 30.8 metabarcoded) (Table 1). Chao1, Chao2, and rarefaction estimates for increasing numbers of ARMS (Fig. 3) or sequences (Fig. S3) show numbers continuing to climb with increased sampling effort.
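As a rough illustration of why many singletons push richness estimates well above the observed OTU count, the Chao1 estimator can be sketched as follows. This is a minimal sketch, not the EstimateS implementation used in the study: the function name `chao1`, the choice of the classic formula with a bias-corrected fallback when there are no doubletons, and the toy counts are all assumptions made for illustration.

```python
def chao1(abundances):
    """Chao1 richness estimate from per-OTU sequence counts.

    Classic form: S_obs + F1^2 / (2 * F2), where F1 and F2 are the numbers
    of OTUs seen exactly once (singletons) and twice (doubletons).
    When F2 is zero, the bias-corrected form S_obs + F1 * (F1 - 1) / 2
    is used so that the estimate stays defined.
    """
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)                      # observed number of OTUs
    f1 = sum(1 for n in counts if n == 1)    # singletons
    f2 = sum(1 for n in counts if n == 2)    # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2.0 * f2)
    return s_obs + f1 * (f1 - 1) / 2.0


# Hypothetical toy sample: 8 observed OTUs, 4 singletons, 2 doubletons.
toy_counts = [10, 5, 2, 2, 1, 1, 1, 1]
toy_estimate = chao1(toy_counts)
```

With the toy counts above, the estimate exceeds the eight observed OTUs because half of them are singletons, which mirrors the pattern in Table 1 where Chao1 values are well above the total number of OTUs.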

Patterns of Community Composition. For the >2-mm organisms, FL and VA were highly distinct, with just four of 98 OTUs in common. Principal coordinate analysis (PCoA) shows FL and VA samples well separated in 2D space along an axis explaining 32.6% and 62.6% of the variation in community composition using Jaccard (presence–absence) and Bray–Curtis (relative abundance) indices, respectively (Fig. S4 A and C). VA and FL samples also partition on an unweighted pair group method with arithmetic mean (UPGMA) tree, with differences between locations well supported by jackknife subsampling (Fig. S4 B and D). Differences in community composition were also significant between locations (Jaccard: $F^{\pi}_{1,16} = 7.32$, P < 0.001; Bray–Curtis: $F^{\pi}_{1,16} = 27.3$, P < 0.001).

In contrast, the sessile and smaller sieved fractions from FL and VA were more similar, with 457 (21%) of the 2,138 OTUs detected via HTS shared. Nevertheless, VA and FL samples are consistently separated on PCoA based on incidence or abundance along a first axis explaining 21.0% and 30.4% of the variation in OTU composition, respectively (Fig. 4 A and C), and they partition into two distinct well-supported groups on a UPGMA tree (Fig. 4 B and D). With all three fractions aggregated, community composition was significantly different between ARMS from VA and FL (Jaccard: $F^{\pi}_{1,16} = 13.21$, P < 0.001; Bray–Curtis: $F^{\pi}_{1,16} = 26.72$, P < 0.001) with 15 OTUs contributing 50% of the difference between VA and FL based on relative abundance (Table S2).
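The comparisons above rest on two dissimilarity indices and a permutation test on a pseudo-F ratio. The sketch below shows one way these pieces fit together. It is an illustration only, not the Vegan code used in the study: the function names, the default of 999 unrestricted permutations, the fixed seed, and the toy OTU counts are all assumptions, and the constrained (stratified) permutations described in the Methods are not reproduced here.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard dissimilarity computed on OTU presence-absence."""
    present_a = {i for i, n in enumerate(a) if n > 0}
    present_b = {i for i, n in enumerate(b) if n > 0}
    union = present_a | present_b
    return 1.0 - len(present_a & present_b) / len(union) if union else 0.0


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity computed on OTU abundances."""
    total = sum(a) + sum(b)
    return sum(abs(x - y) for x, y in zip(a, b)) / total if total else 0.0


def pseudo_f(dist, groups):
    """PERMANOVA pseudo-F for a one-factor design from a distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    if ss_within == 0:
        return float("inf")  # no within-group variation at all
    a = len(labels)
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))


def permanova(samples, groups, metric, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation P value) for one grouping factor."""
    n = len(samples)
    dist = [[metric(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    observed = pseudo_f(dist, groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # unrestricted relabeling of samples
        if pseudo_f(dist, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical toy OTU table: three samples per location, four OTUs.
toy_samples = [
    [10, 0, 5, 0], [8, 1, 6, 0], [9, 0, 4, 1],   # "VA"
    [0, 7, 0, 9], [1, 8, 0, 6], [0, 6, 1, 8],    # "FL"
]
toy_groups = ["VA", "VA", "VA", "FL", "FL", "FL"]
f_obs, p_value = permanova(toy_samples, toy_groups, bray_curtis)
```

With only six samples there are few distinct relabelings, so the smallest attainable P value in this toy case is coarse; the 18 ARMS in the study give the permutation test far more resolution.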
Fig. 1. OTU diversity and abundance in VA and FL. The category “Other animals” comprises Acoelomorpha, Chaetognatha, Ctenophora, Echiura, Entoprocta, Gastrotricha, Hemichordata, Nematoda, Nemertea, Platyhelminthes, Rotifera, Sipuncula, Tardigrada, and Xenacoelomorpha.

The sessile and smaller sieved fractions had distinct OTU composition in both VA and FL. Their separation is clearly seen in PCoA and jackknifed UPGMA trees in both FL and VA (Fig. 4). Fractions were significantly different at both locations based on either OTU presence–absence (VA: $F^{\pi}_{2,24} = 6.56$, P < 0.001; FL: $F^{\pi}_{2,24} = 8.08$, P < 0.001) or relative abundance (VA: $F^{\pi}_{2,24} = 14.82$, P < 0.001; FL: $F^{\pi}_{2,24} = 13.78$, P < 0.001). Although samples of the 2-mm to 500-μm and 500- to 106-μm fractions overlay on the 2D PCoA constructed using the Bray–Curtis metric (Fig. 4C), differences in OTU composition were significant at both locations (VA: $F^{\pi}_{1,15} = 13.05$, P < 0.001; FL: $F^{\pi}_{1,15} = 7.82$, P < 0.001). As expected, based on the plate appearances (Fig. S1C), Porifera and Chordata are primarily responsible for the distinctiveness of the sessile
Table 1. OTU diversity and abundance in VA and FL

VA
                             Barcoding   Metabarcoding
Diversity descriptors        >2 mm       2 mm to 500 μm   500 to 106 μm   Sessile    Total
No. of sequences             498         256,147          97,439          218,704    572,290
Total no. of OTUs            38          651              828             436        1,204
Mean no. of OTUs             11.2        203.3            290.3           146.6      434.2
Mean rarefied no. of OTUs    8.2         117.1            229.7           85.7       333.5
Chao1                        46.0        1,075.6          1,204.9         638.7      1,711.4
Chao1 (rarefied)             49.2        552.7            917.8           451.5      1,174.0
ACE                          47.6        1,062.8          1,223.0         628.3      1,743.0
ACE (rarefied)               57.6        515.4            975.7           459.1      1,217.8
OTUs with match,* %          60.5        14.1             10.6            16.2       10.2
Unidentified OTUs, %         NA          35.6             38.8            31.2       40.9
Singletons, %                31.6        39.8             36.5            34.9       34.8

FL
                             Barcoding   Metabarcoding
Diversity descriptors        >2 mm       2 mm to 500 μm   500 to 106 μm   Sessile    Total
No. of sequences             655         155,232          86,350          168,031    409,613
Total no. of OTUs            64          821              976             591        1,391
Mean no. of OTUs             15.8        277.2            360.1           222.9      536.7
Mean rarefied no. of OTUs    9.6         202.6            312.1           157.4      484.4
Chao1                        104.8       1,183.0          1,486.0         858.0      1,945.7
Chao1 (rarefied)             144.7       866.5            1,213.9         562.1      1,483.7
ACE                          108.6       1,197.8          1,483.1         819.38     1,982.0
ACE (rarefied)               91.6        837.7            1,213.2         556.02     1,521.3
OTUs with match,* %          57.8        15.7             11.8            16.9       11.9
Unidentified OTUs, %         NA          26.8             27.1            23.8       28.3
Singletons, %                46.9        32.3             34.6            30.3       31.1

Additional data and calculation methods are provided in Table S1. NA, not applicable.
*Greater than 97% similarity to GenBank or BOLD sequences.

fraction in FL, whereas unidentified OTUs characterize the 500- to 106-μm fraction (Fig. S5).

We found no evidence for fine-scale geographic structuring of the >2-mm samples based on PCoA (Fig. S4 A and C), UPGMA trees (Fig. S4 B and D), and permutational multivariate analysis of variance (PERMANOVA; Jaccard: $F^{\pi}_{5,12} = 2.33$, P = 0.360; Bray–Curtis: $F^{\pi}_{5,12} = 6.40$, P = 0.397). In contrast, fine-scale structuring in FL was apparent for all three fractions analyzed via metabarcoding on the jackknifed UPGMA tree (Fig. 4 B and D), with strong branch support for triplets representing adjacent ARMS. Although most adjacent sites in VA did not cluster as triplets on the UPGMA trees, differences between sites were highly significant in VA (Jaccard: $F^{\pi}_{2,24} = 1.07$, P < 0.001; Bray–Curtis: $F^{\pi}_{2,24} = 1.41$, P = 0.002) as well as in FL (Jaccard: $F^{\pi}_{2,24} = 1.77$, P < 0.001; Bray–Curtis: $F^{\pi}_{2,24} = 2.11$, P < 0.001).

The differences observed between locations, sites within locations, and fractions cannot be broadly attributed to differences in multivariate dispersion, because tests were insignificant except between fractions in FL (Jaccard: $F^{\pi}_{2,24} = 14.14$, P < 0.001; Bray–Curtis: $F^{\pi}_{2,24} = 14.10$, P < 0.001). Moreover, all differences in community composition remained significant after removing singletons from the metabarcoding dataset.

Validation of the Metabarcoding Approach. Comparing HTS data against more direct assessments of community composition revealed that HTS is effective at detecting OTUs. A total of 91.2% (VA) and 77.1% (FL) of OTUs detected via barcoding of individual specimens isolated from the 2-mm to 500-μm fractions matched OTUs in the metabarcoding dataset from the same location. The proportion of matches between barcoding and metabarcoding datasets for individual ARMS ranged from 65.3–88.2% (i.e., 71.4% match between barcodes and metabarcodes of ARMS 5 in FL). A majority (50–100%) of undetected OTUs were represented by a single specimen, suggesting that they were rare or perhaps absent from the subsample (half of the total) crushed for DNA metabarcoding. Undetected OTUs belonged to several phyla, suggesting no particular taxonomic bias in OTU detection. Similarly, for sessile taxa, individual barcodes from subsamples of conspicuous taxa were always present in the overall metabarcoding dataset (n = 5 in VA and n = 13 in FL). The proportion of matches between barcoding and metabarcoding datasets for individual ARMS was 100% in VA and ranged from 90.9–91.7% in FL.

The relative abundance of sequences also showed good agreement with independent measures of relative abundance. In the 2-mm to 500-μm fractions, there was a significant positive relationship between the amount of DNA per OTU and the number of sequences per OTU for all ARMS from VA (ARMS 1: $t_{22} = 3.95$, P < 0.001; ARMS 4: $t_{16} = 3.58$, P < 0.001; ARMS 9: $t_{18} = 6.64$, P < 0.001; Fig. 5A) and FL (ARMS 1: $t_{41} = 6.49$, P < 0.001; ARMS 5: $t_{34} = 4.54$, P < 0.001; ARMS 9: $t_{48} = 8.91$, P < 0.001; Fig. 5B). At the phylum level (DNA and sequences per OTU pooled by phylum), there were too few points to compute statistics for ARMS individually, but there was an overall significant relationship between the amount of DNA and the number of sequences in VA ($t_{10} = 6.14$, P < 0.001; Fig. 5C) and FL ($t_{20} = 3.72$, P = 0.001; Fig. 5D). For sessile taxa in FL, measures of abundance based on point counts were significantly correlated with numbers of sequences in the metabarcoding dataset at both the OTU (ARMS 1: $t_{10} = 3.70$, P = 0.005; ARMS 5: $t_{10} = 5.26$, P < 0.001; ARMS 9: $t_{12} = 3.60$, P = 0.005; Fig. 5F) and phylum ($t_{13} = 4.02$, P = 0.002; Fig. 5H) levels. No statistics were calculated for the sessile OTUs from VA because of the limited number of data points (Fig. 5 E and G), but the pattern is similar to the pattern exhibited by the FL samples.

Fig. 2. Proportion of identified OTUs in a metabarcoding dataset according to the number of sequences. A match to the reference barcode was defined as >97% similarity. Legend: OTUs in VA only; OTUs in FL only; OTUs at both localities.

Discussion

Our intensive survey of the marine diversity of a small area (a total of 7.82 m² and 0.05 m³ per locality) yielded a surprising amount of diversity: 1,218 OTUs in VA and 1,421 OTUs in FL. Although more than half of the barcode-based OTUs from invertebrates and fish >2 mm matched barcodes in public libraries, only 10–12% (VA/FL) of the metabarcode-based OTUs in the sessile and smaller sieved fractions matched GenBank or


BOLD barcodes. As a result, identification of OTUs detected via metabarcoding relied mostly on phylogenetic assignments to taxonomic groups represented in GenBank, a database still lacking COI references for numerous families of marine invertebrates. Moreover, numerous OTUs remained unidentified, which may be due to the scarcity of representative COI sequences for entire branches of the tree of life, COI barcode misidentifications in GenBank, or limitations in using the hypervariable COI region for phylogenetic assignments. Methodological artifacts (e.g., PCR and sequencing errors or amplification of pseudogenes) are also a possibility, but they likely account for a minor proportion of unidentified OTUs, given our stringent quality filtering based on amino acid translation.

Fig. 3. Sample-based rarefaction curves and Chao1 estimates of total OTU diversity.

Regardless of taxonomy, our data provide a unique opportunity to investigate local and regional patterns of diversity across size fractions of mobile and sessile taxa. As expected, the >2-mm fraction is much less diverse than smaller mobile fractions (over an order of magnitude in VA and FL), a difference partly inherent to the sequencing approach used. Although barcoding provides specimen level data, metabarcoding captures not only “free-living” forms but also parasites, gut contents, and other forms of environmental DNA. We found the smallest sized organisms (500 to 106 μm) to have a 1.96- to 1.54-fold greater rarefied diversity than the 2-mm to 500-μm organisms in VA and FL, respectively, and the sessile fraction had the lowest rarefied OTU richness of the three metabarcoded fractions at both sites (1.37- to 1.29-fold smaller than 2-mm to 500-μm organisms, respectively). A higher local diversity in smaller sized organisms is consistent with the literature (9). However, the small sieve (106 μm) is likely accumulating debris and body parts shedding from sessile and larger motile animals (retained by coarser sieves) during field processing, thereby “artificially” increasing diversity in the 500- to 106-μm fraction [e.g., some sessile OTUs (i.e., Porifera, Bryozoa) are major contributors to differences between sieved fractions in both VA and FL; Table S2].

Also, as expected [(10) and ubiquity hypothesis (11)], the larger organisms showed a greater difference in estimated diversity between the temperate (37.6°N) and subtropical (27.4°N) locations (2.3-fold greater diversity in FL than VA) than did organisms belonging to the smaller sized fractions (1.1- to 1.4-fold greater in FL than VA for the two smaller mobile fractions). (The sessile fraction is composed of large and small organisms, and so cannot be separated in this fashion.) The latitudinal inflation in diversity of the >2-mm fraction is comparable to the latitudinal inflation in diversity observed for Western Atlantic coastal fishes from these latitudes (a two- to threefold difference) and somewhat less than the latitudinal inflation in diversity observed for decapod crustaceans (five- to sixfold difference) (figure 2 of ref. 12).
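The fold differences between size fractions quoted in the Discussion appear to follow directly from the mean rarefied numbers of OTUs in Table 1. The short check below reproduces that arithmetic. The dictionary layout and the helper name `fold` are illustrative assumptions; the input values are copied from Table 1.

```python
# Mean rarefied numbers of OTUs per metabarcoded fraction, copied from Table 1.
rarefied = {
    "VA": {"2 mm to 500 um": 117.1, "500 to 106 um": 229.7, "sessile": 85.7},
    "FL": {"2 mm to 500 um": 202.6, "500 to 106 um": 312.1, "sessile": 157.4},
}


def fold(numerator, denominator):
    """Fold difference between two richness values, rounded to two decimals."""
    return round(numerator / denominator, 2)


# Smallest fraction relative to the 2-mm to 500-um fraction.
smallest_vs_mid = {
    loc: fold(vals["500 to 106 um"], vals["2 mm to 500 um"])
    for loc, vals in rarefied.items()
}

# 2-mm to 500-um fraction relative to the sessile fraction.
mid_vs_sessile = {
    loc: fold(vals["2 mm to 500 um"], vals["sessile"])
    for loc, vals in rarefied.items()
}
```

The two dictionaries hold the ratios for VA and FL in that order, matching the "1.96- to 1.54-fold" and "1.37- to 1.29-fold" figures given in the text.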

Fig. 4. Clustering analyses [PCoA (A and C) and jackknifed UPGMA trees (B and D)] depicting similarity in community composition based on OTU incidence (Jaccard; A and B) and relative abundance (Bray–Curtis; C and D). PC, principal component.

Fig. 5. Relationship between the number of sequences per OTU (A, B, E, and F) or per phylum (C, D, G, and H) obtained via metabarcoding and the total amount of DNA (2-mm to 500-μm samples; A–D) or plate coverage (sessile taxa; E–H).

Our data showed no evidence for fine-scale spatial structuring in larger animal communities but demonstrated community partitioning at the 100-m scale for assemblages of sessile and microscopic taxa. More limited postsettlement dispersal abilities make these communities sensitive to local scale differences in environmental factors (13, 14). Moreover, because numerous microscopic animals may have specific associations with sessile taxa, spatial structuring in sessile assemblages may amplify differences between communities of small mobile taxa.

Finally, the assessment of the robustness of the metabarcoding approach targeting the COI gene suggests this method can be used reliably to detect OTU presence–absence, and it provides useful information on OTU relative abundance as well. Notably, the reliability improves as one moves to coarser groupings, because we have shown a remarkable fidelity at the level of the phylum, which suggests limited PCR bias among distant taxonomic groups. This finding is noteworthy because many ecological assessments work at the level of functional groups rather than at the level of species. Alternatively, PCR-free shotgun metagenomic approaches will be less prone to bias but require much higher sequencing depth (15), therefore increasing the cost for sufficient replication. Taxonomic coverage among animals will also increase with sequencing multiple independent markers [i.e., 18S, 16S (16)], but targeting nonprotein-coding genes may increase the probability of including sequencing artifacts. Moreover, alternative barcode genes would provide a more comprehensive survey of fungi [i.e., internal transcribed spacer (17)] and protists [i.e., 18S (18)]. As marine monitoring moves from the use of indicator groups to more comprehensive community-level analysis of alpha and beta diversity, this study provides support to encourage more routine use of a metabarcoding approach.

Materials and Methods

ARMS Deployment, Collection, and Sampling. ARMS were deployed subtidally adjacent to natural oyster reefs in VA and FL for ∼6 mo (Fig. S1 and SI Text, section II). Upon retrieval, each plate was kept submerged in seawater. Plates were photographed on both sides, and representative sessile taxa were individually tissue-sampled for DNA barcoding. Sessile organisms growing on the plates were then scraped into a tray and homogenized using a kitchen blender, and ∼45 g of tissue was preserved in DMSO buffer (SI Text, sections II.D and III). Water holding ARMS was filtered through 2-mm, 500-μm, and 106-μm sieves. Mobile specimens retained by the 2-mm sieve were photographed alive, identified to the lowest taxon level possible based on morphology, and individually preserved in 95% ethanol (EtOH). The two smaller sieved fractions were initially bulk-preserved in 95% EtOH, and the organic fraction was later separated from sediments by decantation (SI Text, section IV). Each organic fraction was split in half by weight; the first half was crushed using a mortar and pestle to be analyzed via DNA metabarcoding, and the second half was archived in 95% EtOH (a summary of the protocol is provided in Fig. S1D). The biomass and amount of sediment are provided in Table S3.

DNA Barcoding and Metabarcoding. For barcoding, tissue was subsampled from each specimen and placed individually in 96-well Costar plates (Corning) for phenol DNA extraction. DNA amplification and Sanger sequencing used standard protocols (SI Text, section V.A) and previously published primers (19, 20). For metabarcoding, DNA was extracted from 10 g of homogenized sessile tissue and the crushed half of the 2-mm to 500-μm and 500- to 106-μm samples using the MO-BIO Powermax Soil DNA Isolation Kit (SI Text, section V.B). Three replicate PCR assays were performed to amplify an ∼313-bp COI fragment for each of the 54 bulk samples (SI Text, section V.B). We used a hierarchical tagging approach for sample multiplexing [combination of tailed PCR primers and Ion Xpress (Life Technologies) barcode adapters; SI Text, section V.B and Table S4]. Amplicons were sequenced on the Ion Torrent platform (Life Technologies) following the manufacturer’s instructions (SI Text, section V.B). Barcode sequences were deposited in GenBank (accession


nos. KP253982–KP255345) and BOLD (doi: dx.doi.org/10.5883/DS-ARMS), and the metabarcode datasets were deposited in the Dryad Digital Repository (doi: doi.org/10.5061/dryad.d0r79).

Sequence Analysis of Barcodes and Metabarcodes. For barcodes, forward and reverse sequences were assembled, checked for stop codons or frame shifts, and edited in Geneious (Biomatters). We used the Bayesian clustering algorithm implemented in clustering 16S rRNA for OTU prediction (CROP) (21) to delineate OTUs based on the natural distribution of sequence dissimilarity in the dataset (SI Text, section VI.A). CROP outputs a representative sequence per OTU that was used for taxonomic identification. For metabarcodes, higher quality reads prefiltered by Torrent Suite Software version 4.0.2 (Life Technologies) were assigned to samples based on the combination primer tail-Ion Xpress barcode. Additional sequences were removed if they did not meet several criteria (SI Text, section VI.B). We then took advantage of the coding property of the COI gene to improve the quality and reliability of our dataset further by discarding reads with any anomaly in their amino acid translation using Multiple Alignment of Coding Sequences (MACSE) (22) (SI Text, section VI.B). Finally, reads were screened for chimeras using UCHIME (23). Remaining reads were clustered in OTUs using CROP following the parameters detailed in SI Text, section VI.A. For taxonomic assignments, we performed BLASTn searches of OTU representative sequences of the barcoding and metabarcoding datasets against full GenBank and BOLD databases. We accepted a species level match when similarity to the reference barcode was >97%. In the absence of a direct match, we used a phylogenetic approach implemented in the Statistical Assignment Package (24) to assign a higher taxonomic level (SI Text, section VI.C).

Statistical Analyses. We summarized barcoding and metabarcoding data in separate OTU tables. Sample-based rarefaction curves and nonparametric species richness estimators were computed in EstimateS (25). Each OTU table was rarefied to the lowest number of sequences using Quantitative Insights into Microbial Ecology (QIIME) (26) to account for differences in abundance. We used Jaccard (presence–absence) and Bray–Curtis (relative abundance) metrics to calculate pairwise community distance matrices and examine differences in beta diversity. Patterns of sample dissimilarity were visualized using PCoA. Hierarchical cluster trees were also constructed using UPGMA with jackknife support to examine the robustness of sample clustering. We examined differences in community composition between locations and sites using PERMANOVA (27). We also tested whether the average distance to the group centroid is equivalent among groups [multivariate dispersion: PERMDISP (28)]. To account for the stratified structure of the design, we constrained permutations within factors when necessary (i.e., within locality to test for differences among sites). Metabarcoding data were initially aggregated per ARMS to test for overall differences in OTU composition and dispersion between locations and sites. We then partitioned the OTU table between locations to test for differences between fractions and sites. We repeated all statistical analysis after removing singletons from the metabarcoding OTU table to test for the robustness of the patterns. Finally, we conducted a similarity of percentage analysis to determine which OTUs were major contributors to differences between locations and fractions based on relative abundance. All tests were computed in the R package Vegan (29) and significance-tested using 1,000 permutations.

Assessment of Reliability of Metabarcoding Approach. One ARMS from each of the three sites at each location was randomly chosen. Archived bulk samples were resuspended in a graduated beaker containing 100 mL of 95% EtOH and homogenized with a spatula, and 20 mL was immediately collected using a Hensel–Stempel pipette. All specimens were isolated and identified to the lowest taxonomic level; the entire specimen was used for phenol DNA extraction, and the mitochondrial COI gene was sequenced for OTU delineation. The amount of DNA in each individual extract was measured with a Qubit fluorometer (dsDNA HS Assay kit; Invitrogen), enabling the calculation of the total amount of DNA represented by each OTU and subsequent comparison with the number of reads obtained for the same OTU via metabarcoding (SI Text, section VII.A). For the sessile organisms, we individually sampled and barcoded morphologically distinctive taxa to identify matching OTUs in the metabarcoding dataset. The number of reads per OTU was then compared with their estimated cover on each ARMS as measured by a point count approach implemented in Coral Point Count with Excel extensions (CPCe) (30) (SI Text, section VII.B).

The relationship between the number of reads in the metabarcoding dataset and the amount of DNA or number of point counts (for 2-mm to 500-μm and sessile specimens, respectively) was tested using a generalized linear model. Given the nature and significant overdispersion of the data, we fitted a quasi-Poisson model with log link function.

ACKNOWLEDGMENTS. We thank Natalia Agudelo, Sherry Reed, Woody Lee, and Sean Fate for help in the field; the Smithsonian Laboratory of Analytical Biology staff for assistance; and Daryl Hurley II for allowing research to be conducted on his oyster reef in VA. Research Permit SAJ-2012-02893(NW-SLR) was provided by the US Army Corps of Engineers in FL. Financial support was provided by the Sant Chair and Smithsonian Tennenbaum Marine Observatories Network, for which this paper is Contribution 1.

1. Bouchet P, Lozouet P, Maestrati P, Heros V (2002) Assessing the magnitude of species richness in tropical marine environments: Exceptionally high numbers of molluscs at a New Caledonia site. Biol J Linn Soc Lond 75(4):421–436.
2. Appeltans W, et al. (2012) The magnitude of global marine species diversity. Curr Biol 22(23):2189–2202.
3. Mora C, Tittensor DP, Adl S, Simpson AGB, Worm B (2011) How many species are there on Earth and in the ocean? PLoS Biol 9(8):e1001127.
4. Tittensor DP, et al. (2010) Global patterns and predictors of marine biodiversity across taxa. Nature 466(7310):1098–1101.
5. Bourlat SJ, et al. (2013) Genomics in marine monitoring: New opportunities for assessing marine health status. Mar Pollut Bull 74(1):19–31.
6. Fonseca VG, et al. (2014) Metagenetic analysis of patterns of distribution and diversity of marine meiobenthic eukaryotes. Glob Ecol Biogeogr 23(11):1293–1302.
7. Plaisance L, Caley MJ, Brainard RE, Knowlton N (2011) The diversity of coral reefs: What are we missing? PLoS ONE 6(10):e25026.
8. Beck MW, et al. (2011) Oyster reefs at risk and recommendations for conservation, restoration, and management. Bioscience 61(2):107–116.
9. Azovsky A (2002) Size-dependent species-area relationships in benthos: Is the world more diverse for microbes? Ecography 25(3):273–282.
10. Hillebrand H (2004) On the generality of the latitudinal diversity gradient. Am Nat 163(2):192–211.
11. Fenchel T, Finlay BJ (2004) The ubiquity of small species: Patterns of local and global diversity. Bioscience 54(8):777–784.
12. Macpherson E (2002) Large-scale species-richness gradients in the Atlantic Ocean. Proc Biol Sci 269(1501):1715–1720.
13. Green JL, et al. (2004) Spatial scaling of microbial eukaryote diversity. Nature 432(7018):747–750.
14. Curini-Galletti M, et al. (2012) Patterns of diversity in soft-bodied meiofauna: Dispersal ability and body size matter. PLoS ONE 7(3):e33801.
15. Zhou X, et al. (2013) Ultra-deep sequencing enables high-fidelity recovery of biodiversity for bulk arthropod samples without PCR amplification. Gigascience 2(1):4.
16. Gibson J, et al. (2014) Simultaneous assessment of the macrobiome and microbiome in a bulk sample of tropical arthropods through DNA metasystematics. Proc Natl Acad Sci USA 111(22):8007–8012.
17. Schoch CL, et al. (2012) Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. Proc Natl Acad Sci USA 109(16):6241–6246.
18. Pawlowski J, et al. (2012) CBOL protist working group: Barcoding eukaryotic richness beyond the animal, plant, and fungal kingdoms. PLoS Biol 10(11):e1001419.
19. Geller J, Meyer C, Parker M, Hawk H (2013) Redesign of PCR primers for mitochondrial cytochrome c oxidase subunit I for marine invertebrates and application in all-taxa biotic surveys. Mol Ecol Resour 13(5):851–861.
20. Leray M, et al. (2013) A new versatile primer set targeting a short fragment of the mitochondrial COI region for metabarcoding metazoan diversity: Application for characterizing coral reef fish gut contents. Front Zool 10:34.
21. Hao X, Jiang R, Chen T (2011) Clustering 16S rRNA for OTU prediction: A method of unsupervised Bayesian clustering. Bioinformatics 27(5):611–618.
22. Ranwez V, Harispe S, Delsuc F, Douzery EJP (2011) MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS ONE 6(9):e22594.
23. Edgar RC, Haas BJ, Clemente JC, Quince C, Knight R (2011) UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 27(16):2194–2200.
24. Munch K, Boomsma W, Huelsenbeck JP, Willerslev E, Nielsen R (2008) Statistical assignment of DNA sequences using Bayesian phylogenetics. Syst Biol 57(5):750–757.
25. Colwell RK (2006) EstimateS: Statistical Estimation of Species Richness and Shared Species from Samples. Version 9.1. Available at purl.oclc.org/estimates. Accessed August 15, 2014.
26. Caporaso JG, et al. (2010) QIIME allows analysis of high-throughput community sequencing data. Nat Methods 7(5):335–336.
27. Anderson MJ (2001) A new method for non-parametric multivariate analysis of variance. Austral Ecol 26:32–46.
28. Anderson MJ, Ellingsen KE, McArdle BH (2006) Multivariate dispersion as a measure of beta diversity. Ecol Lett 9(6):683–693.
29. Oksanen J, et al. (2009) Vegan: Community ecology package. R package version 2.0-10. Available at cran.r-project.org/package=vegan. Accessed August 15, 2014.
30. Kohler KE, Gill SM (2006) Coral Point Count with Excel extensions (CPCe): A Visual Basic program for the determination of coral and substrate coverage using random point count methodology. Comput Geosci 32(9):1259–1269.

Leray and Knowlton PNAS | February 17, 2015 | vol. 112 | no. 7 | 2081
Supporting Information
Leray and Knowlton 10.1073/pnas.1424997112
SI Text

I. ARMS Specifications
ARMS consist of ten 22.5 × 22.5-cm PVC plates separated by 1.27-cm spacers, anchored to a baseplate (Fig. S1). In alternate layers, water flow through the spaces was obstructed by bars running from the corners to the center of the plate. The total surface area sampled was 0.869 m² per ARMS, and the total volume between plates was 0.005 m³ per ARMS.

II. Field Sampling Protocols
A. Deployment and Recovery. ARMS were deployed subtidally adjacent to natural oyster reefs on September 19, 2013, in VA and on November 6, 2013, in FL. We used stainless-steel stakes to anchor the structures to the reef at a level guaranteeing that they remained submerged at low tide. ARMS were collected on May 3–4, 2014, and May 26–29, 2014, in VA and FL, respectively, for a soak time of ∼6 mo. To prevent loss of community members, a 100-μm Nitex-lined crate was placed over the ARMS structure and fastened with two or three hooked elastic cords (bungies) before removal of the ARMS from the bottom. Lined crates are designed to cover the central structure made of 10 PVC plates only, which means that mobile specimens occurring on the base plate are able to escape during ARMS handling in the field. Stakes used for anchoring ARMS to the substrate were then removed, and a small cable tie was placed at the northern corner of the baseplate to keep track of orientation. ARMS were then placed in a large plastic container with seawater and at least two aeration stones and transported to the wet laboratory.

B. Disassembly. The lined crate was removed, rinsed over the plastic container in which the structure was transported with seawater from the plastic container, and examined for any hiding organisms. The ARMS were positioned upside-down to unscrew nuts and bolts at each corner (long bolts are left in place to allow the removal of each plate one by one). The baseplate was removed first, brushed minimally inside the plastic container to remove any mobile animals, and placed aside (not analyzed). Each of the 10 plates was then removed one by one and lightly brushed, a small cable tie was placed at the northern corner of the plate, and the plate was photographed on both sides and finally placed in labeled (with ARMS and plate number) 5-gallon buckets containing 45-μm filtered seawater and an air stone.

C. Processing Sieved Fractions. Water from the large plastic container was filtered through three sets of sieves [2 mm (no. 10), 500 μm (no. 32), and 106 μm (no. 140)], and each fraction was placed in an individual tray with an air stone. Mobile specimens retained by the 2-mm sieve were sorted to morphospecies; photographed alive to document color patterns; and anesthetized using clove oil, magnesium chloride, or chilling before preservation in 95% EtOH. Pieces of algae and other sessile organisms retained by the 2-mm sieve were not processed. The two smaller sieved fractions (2 mm to 500 μm and 500 to 106 μm) were washed with seawater into a 45-μm Nitex net and preserved in falcon tubes (or larger jars depending on the volume of the sample) containing 95% EtOH. Both individual mobile animals larger than 2 mm and smaller sieved fractions were kept at −20 °C until DNA extraction.

D. Processing the Sessile Fraction. The most common and conspicuous sessile taxa found on the plates were photographed, and a small tissue sample was preserved in salt-saturated 25% (vol/vol) DMSO buffer [0.25 M EDTA (pH 7.5), DMSO, NaCl-saturated] for DNA barcoding. The surface and sides of all PVC plates were then scraped into a tray of seawater or EtOH, and the total content was poured into a kitchen blender with 45-μm filtered seawater (roughly 1:1 in volume) for homogenization for 30 s at maximum speed. Blended material was then immediately poured into a collection net (45-μm Nitex mesh) and rinsed with seawater or 95% EtOH, squeezing out the liquid through the mesh at least twice. On-site homogenization followed by the washing step was found to give high-molecular-weight DNA as detailed below. After the last wash and squeezing out of liquid, ∼15 g of material was placed inside each of three falcon tubes that were then filled with DMSO buffer. Falcon tubes were placed at −20 °C, along with any remaining tissue that was frozen, in plastic bags.

E. Avoiding Contamination. Because the PCR-based approach to characterize communities is very sensitive to contamination, each piece of equipment was soaked in 10% bleach (sodium hypochlorite) for a minimum of 5 min before first use and between samples for sterilization. Nitrile gloves were used to manipulate equipment at all times.

III. Tests of DNA Preservation of Sessile Fraction
Preliminary tests were conducted to determine the best approach to obtain high-molecular-weight DNA from the sessile fraction. An initial protocol was designed in which plates were first submerged in 95% EtOH for several hours to reduce the amount of water in animal tissues. Then, tissues were scraped into a large container filled with EtOH (ratio of ∼1:10) and preserved at −20 °C for several days or weeks. Finally, tissues were homogenized (using a blender) in a small amount of EtOH, and ∼10 g of material was immediately collected for DNA extraction in the laboratory. That approach provided very low DNA quality [100% of DNA fragments shorter than 300 bp as measured by a TapeStation (Agilent Technologies)] for 90% of samples tested. After ruling out the potential effect of mechanical shearing during tissue homogenization, the protocol was modified to minimize storage time by conducting tissue homogenization and DNA extraction in the field. Nevertheless, DNA was still degraded, which suggested that chemical denaturation potentially caused by substances released by sessile animals occurred quickly following tissue homogenization. We were able to obtain very high-quality DNA (75% of DNA fragments longer than 10 kb as measured by the Agilent TapeStation) across all samples by homogenizing samples shortly after scraping (plates are kept in aerated seawater) and immediately rinsing the homogenate in a 45-μm mesh collection net using seawater or EtOH. We used DMSO buffer for tissue preservation because it was shown to be more effective than EtOH for preserving DNA of several sessile taxonomic groups (1).

IV. Decantation of Sieved Fractions
Small sieved fractions (2 mm to 500 μm and 500 to 106 μm) contain sediments that should be separated from the organic fraction before DNA extraction. Each sample was therefore transferred into a 2-L cylinder and filled up to the 1.5-L level with deionized water. The cylinder was sealed with parafilm and shaken vigorously to resuspend animals and other organic matter, and the water was poured quickly into a 45-μm sieve. Sample resuspension was repeated five times (or until no organic particulates could be observed after shaking). The material retained by the sieve was weighed and homogenized with a spatula. Half

Leray and Knowlton www.pnas.org/cgi/content/short/1424997112 1 of 11


of the sample was then crushed using a mortar and pestle for 2 min and preserved in a falcon tube with 95% EtOH for DNA extraction. The other half was archived (in 95% EtOH) or used for morphological analysis, as in the present study. Sediments collected in the bottom of the cylinder were also archived. All equipment used for decantation was bleached and UV-sterilized between samples.

We examined the 2-mm to 500-μm sediments from VA and FL to quantify the abundance and diversity of mollusks and other organisms. There was a mean (±SD) of 4.1 (±4.3) and 5 (±4.7) specimens in the 2-mm to 500-μm sediments from VA and FL, respectively, with a majority being mollusks [2 (±2.3) and 3.7 (±3.7) specimens per ARMS, respectively]. We also found a few amphipods and isopods. We obtained COI sequences from 18 specimens found in sediments from VA. They belonged to nine OTUs, all of them (including two mollusk OTUs) matching reference OTUs in the metabarcoding dataset, which shows the effectiveness of the decantation process in retaining organisms for metabarcoding.

V. Laboratory Protocols
A. DNA Barcoding. A small piece of tissue was collected from each specimen retained by the 2-mm sieve and placed individually in 96-well Costar plates (Corning) for phenol DNA extraction performed on an AutoGeneprep 965 (Autogen). DNA from sessile taxa and whole specimens from the 500-μm to 2-mm fractions were also extracted using the same procedure. Eluted DNA was used for PCR amplification of a ∼658-bp fragment of the mitochondrial COI gene using the following PCR mixture: 19-μL reaction with 10 μL of Promega GoTaq G2 Hot Start Master Mix, 0.6 μL of 10 μM each forward or reverse primer [jgLCO/jgHCO (2)] and 0.2 μL of 20 mg/mL BSA. PCR thermal cycling conditions were as follows: 5 min at 95 °C; four cycles of 30 s at 94 °C, 45 s at 50 °C, and 60 s at 72 °C; 34 cycles at 45 °C annealing temperature; and a final extension of 8 min at 72 °C. PCR product was purified using ExoSAP-IT (Affymetrix), and sequences were generated in both directions with the Sanger sequencing platform. We then repeated the PCR assay with the mlCOIintF/jgHCO primer combination (3) whenever the initial reaction was not successful (∼8% of samples) to increase our success rate.

B. DNA Metabarcoding.
DNA extractions. DNA was extracted from 10 g of homogenized sessile tissue, and the crushed half of the 2-mm to 500-μm and 500- to 106-μm samples using the MO-BIO Powermax Soil DNA Isolation Kit. The initial bead-beating step of the kit was found to shear DNA. Therefore, we added proteinase K (0.4 mg/mL) to the powerbead solution (+ C1 solution) instead and incubated samples in a shaking incubator overnight at 56 °C, which ensured effective tissue lysis. We followed the manufacturer’s instructions for the rest of the protocol. However, for sessile samples, we only used one-third of the homogenized tissue lysate for extraction to prevent clogging of the silica membrane of the spin column. Extracted DNA was purified using the MO-BIO Powerclean DNA Clean-Up Kit and quantified with a Qubit fluorometer (dsDNA HS Assay kit; Invitrogen) before PCR amplification.

PCR amplification, tagging, and sequencing. We used a hierarchical tagging approach (as in ref. 3) combining seven tailed PCR primers (Table S4) and eight Ion Xpress barcode adapters (Life Technologies) for sample multiplexing. Three replicate PCR assays were performed to amplify an ∼313-bp COI fragment for each of the 54 bulk samples using the following PCR mixture: 20-μL reaction with 1 μL of 10 μM each forward or reverse primer (tailed-mlCOIintF/tailed-jgHCO; Table S4), 1.4 μL of 10 mM dNTP, 0.4 μL of Clontech Advantage 2 Polymerase Mix, 2 μL of Clontech Advantage 2 PCR buffer, and 1 μL (10 ng) of purified DNA. We used the touchdown PCR profile with 16 initial cycles: denaturation for 10 s at 95 °C, annealing for 30 s at 62 °C (−1 °C per cycle), and extension for 60 s at 72 °C, followed by 20 cycles at an annealing temperature of 46 °C. Triplicate PCR products were pooled and purified using Agencourt AMPure XP beads, and equimolar amounts of each sample were pooled, with each pool containing amplicons generated with each of the seven tailed-primer pairs (total of eight pools). End-repair (Ion Plus Fragment Library kit) and ligation of Ion Xpress barcode adapters were conducted following the manufacturer’s instructions (Life Technologies). Library templates were clonally amplified using the OT2 400-bp kit on the Ion One Touch 2, and enriched template ISPs were sequenced on the Ion Torrent platform using the Ion PGM 400-bp version 2 protocol (all from Life Technologies).

VI. Data Analysis
A. DNA Barcoding. Forward and reverse sequences were assembled, checked for stop codons or frame shifts, and edited in Geneious (Biomatters). Our dataset comprised a diversity of taxonomic groups so that using a fixed sequence dissimilarity cutoff (i.e., 5%) for clustering OTUs would not result in accurate species delineations. Therefore, we used the Bayesian clustering algorithm implemented in CROP (4) to delineate OTUs based on the natural distribution of sequence dissimilarity in the dataset. Lower and upper bound variance was set to 3 and 4, respectively, because these settings were shown to provide the best results for marine invertebrates (3). CROP outputs a representative sequence per OTU that was used for taxonomic identification.

B. DNA Metabarcoding. Higher quality reads prefiltered by Torrent Suite Software version 4.0.2 (Life Technologies) were assigned to samples based on the combination primer tail-Ion Xpress barcode. Additional sequences were removed from the prefiltered dataset if they (i) were shorter than 250 bp, (ii) had more than two mismatches in the primer sequence, (iii) had any ambiguous base call, or (iv) had at least one homopolymer region longer than 8 bp. We then used the option “enrichAlignment” in Multiple Alignment of Coding Sequences (MACSE) (5) to align our reads to the high-quality library of COI barcodes of the Moorea Biocode project (7,675 sequences from 30 animal phyla represented), an all-taxa biodiversity inventory of the Moorea Island ecosystem (6), retaining sequences that had zero stop codons (using invertebrate mitochondrial translation table), zero frame shifts, zero insertions, and no more than three deletions. This latter step maximizes the reliability of the sequence dataset.

C. Taxonomic Assignments. We performed BLASTn searches (7) of OTU representative sequences of the barcoding and metabarcoding datasets in GenBank and BOLD. Previous papers that used similar DNA sequencing approaches to look at terrestrial (8) and marine diversity (3, 9) used 98% assignments. However, the distribution of sequence similarity in our dataset showed a significant number of matches between 97% and 98% before rapidly dropping below 90%, a pattern that may be driven by matches to specimens collected across the Atlantic Ocean. Therefore, we accepted species level matches when similarity to the reference barcode was higher than 97%.

In the absence of a direct match we used a phylogenetic approach implemented in the Statistical Assignment Package (10) to assign OTUs to higher taxonomic levels. The program was set to download up to 40 homologs from GenBank with ≥70% sequence identity. We accepted taxonomic assignments at an 80% posterior probability cutoff, but we did not consider assignment lower than order level to minimize misidentifications due to the lack of data for some taxonomic groups in GenBank. OTUs that matched or were assigned to bacteria were removed.
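The two-step assignment rule of section VI.C (species-level match above 97% similarity, otherwise a higher-rank assignment at an 80% posterior probability cutoff and no lower than order) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `assign_taxonomy`, the `rank_posteriors` dictionary layout, and the choice to treat the 80% cutoff as inclusive are all assumptions, and the sketch does not reproduce the actual output format of BLASTn or the Statistical Assignment Package.

```python
from typing import Dict, Optional, Tuple

SPECIES_CUTOFF = 97.0    # percent similarity; species-level match only above this
POSTERIOR_CUTOFF = 0.80  # assumed inclusive cutoff for higher-rank assignments
# Ranks considered, from most to least specific; nothing below order is accepted.
ALLOWED_RANKS = ("order", "class", "phylum", "kingdom")


def assign_taxonomy(
    blast_similarity: Optional[float],
    blast_species: Optional[str],
    rank_posteriors: Dict[str, Tuple[str, float]],
) -> Optional[Tuple[str, str]]:
    """Return (rank, taxon) for one OTU, or None if it stays unidentified.

    rank_posteriors is a hypothetical structure mapping a rank name to
    (taxon, posterior probability) from a phylogenetic assignment.
    """
    # Step 1: direct species-level match to a reference barcode.
    if (
        blast_similarity is not None
        and blast_species
        and blast_similarity > SPECIES_CUTOFF
    ):
        return ("species", blast_species)

    # Step 2: most specific allowed rank with sufficient posterior support.
    for rank in ALLOWED_RANKS:
        if rank in rank_posteriors:
            taxon, posterior = rank_posteriors[rank]
            if posterior >= POSTERIOR_CUTOFF:
                return (rank, taxon)

    return None
```

For example, an OTU with a 90% best hit but a class-level posterior of 0.85 would be reported at class level, while a genus-level assignment is ignored regardless of its support.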

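The four read-level filters listed in section VI.B (length, primer mismatches, ambiguous bases, homopolymers) can be expressed as a single predicate. The sketch below is illustrative only: the function names are hypothetical, the primer is assumed to sit at the start of the read, degenerate primer positions are matched with standard IUPAC codes, and whether the 250-bp threshold is applied with or without the primer is not stated in the text, so the full read length is used here.

```python
import re

MIN_LENGTH = 250           # (i) reads shorter than this are removed
MAX_PRIMER_MISMATCHES = 2  # (ii) more than two primer mismatches -> removed
MAX_HOMOPOLYMER = 8        # (iv) any homopolymer run longer than this -> removed

# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_mismatches(primer: str, read: str) -> int:
    """Count read positions incompatible with the (possibly degenerate) primer."""
    prefix = read[: len(primer)]
    mismatches = len(primer) - len(prefix)  # missing bases count as mismatches
    for primer_code, base in zip(primer, prefix):
        if base not in IUPAC.get(primer_code, ""):
            mismatches += 1
    return mismatches


def has_ambiguous_base(read: str) -> bool:
    """(iii) True if the read contains anything other than A, C, G, or T."""
    return re.search(r"[^ACGT]", read) is not None


def longest_homopolymer(read: str) -> int:
    """Length of the longest run of a single repeated base."""
    longest = 0
    for run in re.finditer(r"(.)\1*", read):
        longest = max(longest, len(run.group(0)))
    return longest


def passes_filters(read: str, primer: str) -> bool:
    """Apply filters (i)-(iv) to one read; True means the read is kept."""
    read = read.upper()
    return (
        len(read) >= MIN_LENGTH
        and primer_mismatches(primer.upper(), read) <= MAX_PRIMER_MISMATCHES
        and not has_ambiguous_base(read)
        and longest_homopolymer(read) <= MAX_HOMOPOLYMER
    )
```

Reads that pass would then go on to the MACSE alignment step described in the same section, which is not sketched here.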

VII. Assessment of Reliability of Metabarcoding Approach
A. Fraction Sized 2 mm to 500 μm. One ARMS from each of the three sites at each location was randomly chosen for analysis. Archived bulk samples (see SI Text, section IV), which correspond to half of the sample measured by weight, were resuspended in a graduated beaker containing 100 mL of 95% EtOH and homogenized with a spatula, and 20 mL was immediately collected using a Hensen–Stempel pipette. All specimens were isolated and identified to the lowest taxonomic level using morphology, the entire specimen was used for phenol DNA extraction according to the protocol described in SI Text (section V.A), and the mitochondrial COI gene was sequenced for OTU delineation.

The amount of DNA in each individual extract was measured with a Qubit fluorometer. The total amount of DNA represented by each OTU was calculated by summing the amount of DNA of each specimen belonging to that same OTU according to COI barcodes. To identify OTUs shared between datasets, we ran local BLAST searches [using Geneious (Biomatters)] of one representative sequence per OTU obtained via barcoding against the full database of OTU representative sequences obtained via metabarcoding.

To evaluate the efficacy of metabarcoding at detecting diversity, we first calculated the overall proportion of OTUs shared by the barcoding and metabarcoding datasets at each location. We then conducted similar calculations but between datasets of corresponding ARMS. For example, we compared the proportion of shared OTUs between the barcoding and metabarcoding datasets of ARMS 1 from FL. To evaluate the efficacy of metabarcoding at estimating OTU relative abundance, we first tested the relationship between the amount of DNA per OTU and number of reads per OTU in the metabarcoding dataset. We pooled amounts of DNA and read number for OTUs belonging to the same phylum to test for the same relationship at the level of functional groups.

We sorted and photographed a total of 251 and 954 animals in three 2-mm to 500-μm fractions from VA and FL, respectively (representing one ARMS from each of the three sites at the two locations). A total of 671 specimens were individually barcoded, which includes all specimens except Tanaidacea and Ostracoda from FL, for which only ∼25% of specimens were individually barcoded because of their high abundance. Based on the subset of specimens analyzed, abundant Ostracoda and Tanaidacea belonged to one and three OTUs, respectively. We extrapolated the amount of DNA for each of these four OTUs for subsequent analysis.

B. Sessile Fraction. We individually subsampled and barcoded morphologically distinctive sessile taxa to identify matching OTUs in the metabarcoding dataset using local BLASTn searches (as discussed above). The number of reads per OTU was then compared with the estimated cover of each OTU on each ARMS as measured by a point count approach implemented in Coral Point Count with Excel extensions (CPCe) (11). A 15 × 15 grid was positioned over each plate photograph, and the taxon located under each intersection of the grid was recorded. All 10 plates (19 sides) were scored for each ARMS.

The efficacy of metabarcoding at detecting diversity and relative abundance of sessile taxa was evaluated using similar calculations as presented in the previous section (SI Text, section VII.A).
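The point-count scoring in section VII.B reduces to a simple tally: each grid intersection is labeled with the taxon beneath it, and cover is the share of points carrying that label. The sketch below is a hedged illustration rather than a description of CPCe itself; the function name `percent_cover` is hypothetical, and it assumes cover is expressed relative to all scored points (including any bare or unidentified points), which the text does not specify.

```python
from collections import Counter
from typing import Dict, Iterable

GRID_POINTS_PER_SIDE = 15 * 15  # intersections scored on each plate photograph
SIDES_PER_ARMS = 19             # all 10 plates, 19 scored sides per ARMS


def percent_cover(point_labels: Iterable[str]) -> Dict[str, float]:
    """Convert a list of per-point taxon labels into percent cover per taxon."""
    counts = Counter(point_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


# Number of points scored for one fully scored ARMS under these settings.
POINTS_PER_ARMS = GRID_POINTS_PER_SIDE * SIDES_PER_ARMS
```

The resulting per-OTU cover values are what would be compared with read counts for the matching OTUs in the metabarcoding dataset.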

1. Gaither MR, Szabó Z, Crepeau MW, Bird CE, Toonen RJ (2010) Preservation of corals in salt-saturated DMSO buffer is superior to ethanol for PCR experiments. Coral Reefs 30(2):329–333.
2. Geller J, Meyer C, Parker M, Hawk H (2013) Redesign of PCR primers for mitochondrial cytochrome c oxidase subunit I for marine invertebrates and application in all-taxa biotic surveys. Mol Ecol Resour 13(5):851–861.
3. Leray M, et al. (2013) A new versatile primer set targeting a short fragment of the mitochondrial COI region for metabarcoding metazoan diversity: Application for characterizing coral reef fish gut contents. Front Zool 10:34.
4. Hao X, Jiang R, Chen T (2011) Clustering 16S rRNA for OTU prediction: A method of unsupervised Bayesian clustering. Bioinformatics 27(5):611–618.
5. Ranwez V, Harispe S, Delsuc F, Douzery EJP (2011) MACSE: Multiple Alignment of Coding SEquences accounting for frameshifts and stop codons. PLoS ONE 6(9):e22594.
6. Leray M, Boehm JT, Mills SC, Meyer CP (2012) Moorea BIOCODE barcode library as a tool for understanding predator–prey interactions: Insights into the diet of common predatory coral reef fishes. Coral Reefs 31(2):383–388.
7. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215(3):403–410.
8. Yu DW, et al. (2012) Biodiversity soup: Metabarcoding of arthropods for rapid biodiversity assessment and biomonitoring. Methods Ecol Evol 3(4):613–623.
9. Machida RJ, Hashiguchi Y, Nishida M, Nishida S (2009) Zooplankton diversity analysis through single-gene sequencing of a community sample. BMC Genomics 10:438.
10. Munch K, Boomsma W, Huelsenbeck JP, Willerslev E, Nielsen R (2008) Statistical assignment of DNA sequences using Bayesian phylogenetics. Syst Biol 57(5):750–757.
11. Kohler KE, Gill SM (2006) Coral Point Count with Excel extensions (CPCe): A Visual Basic program for the determination of coral and substrate coverage using random point count methodology. Comput Geosci 32(9):1259–1269.

Fig. S1. Illustration of study design and diversity encountered. (A) Map of experimental design. (B) Photographs of Virginia location, ARMS, and ARMS re-
covery. (C) Photographs of representative ARMS plates and organisms in the 2-mm to 500-μm fraction. (D) Sample processing workflow. In C, scale bars are
provided for individual organisms, and the square plates are 22.5 cm on each side.

Fig. S2. Proportion of identified OTUs in the metabarcoding dataset according to the number of ARMS where they were detected. (A) OTUs in VA only. (B) OTUs in FL only. (C) OTUs at both localities. OTUs were considered to match a reference barcode if they had >97% similarity to a COI sequence in BOLD or GenBank or to a reference barcode generated in this study.

Fig. S3. Individual-based rarefaction curves.

Fig. S4. Clustering analyses [PCoA (A and C) and UPGMA trees (B and D)] depicting similarity in community composition among >2-mm samples based on OTU
incidence (Jaccard; A and B) and relative abundance (Bray–Curtis; C and D). PC, principal component.
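The two dissimilarity measures named in this caption can be written out for a single pair of samples. This is a minimal sketch of the standard Jaccard (presence–absence) and Bray–Curtis (abundance) definitions, not the QIIME implementation used in the study; the function names and the dictionary-based sample representation (OTU identifier mapped to read or specimen count) are assumptions made for illustration.

```python
from typing import Dict


def jaccard_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """1 - (shared OTUs / OTUs present in either sample), using incidence only."""
    present_a = {otu for otu, n in a.items() if n > 0}
    present_b = {otu for otu, n in b.items() if n > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


def bray_curtis_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Sum of absolute abundance differences divided by total abundance."""
    otus = set(a) | set(b)
    numerator = sum(abs(a.get(otu, 0) - b.get(otu, 0)) for otu in otus)
    denominator = sum(a.get(otu, 0) + b.get(otu, 0) for otu in otus)
    if denominator == 0:
        return 0.0
    return numerator / denominator
```

Applying either function to every pair of samples yields the pairwise distance matrix that ordination (PCoA) and clustering (UPGMA) operate on.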

Fig. S5. PCoA with coordinates of the 10 most abundant phyla. The size of the sphere is proportional to the mean relative abundance of the taxon across samples.

Table S1. OTU diversity and abundance as revealed by DNA barcoding and DNA metabarcoding in ARMS from VA and FL

| Diversity descriptors | VA barcoding, >2 mm | VA metabarcoding, 2 mm to 500 μm | VA metabarcoding, 500 to 106 μm | VA metabarcoding, sessile | VA metabarcoding, total | FL barcoding, >2 mm | FL metabarcoding, 2 mm to 500 μm | FL metabarcoding, 500 to 106 μm | FL metabarcoding, sessile | FL metabarcoding, total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No. of sequences | 498 | 256,147 | 97,439 | 218,704 | 572,290 | 655 | 155,232 | 86,350 | 168,031 | 409,613 |
| Total no. of OTUs | 38 | 651 | 828 | 436 | 1,204 | 64 | 821 | 976 | 591 | 1,391 |
| Mean (±SD) no. of OTUs | 11.2 ± 4.1 | 203.3 ± 52.3 | 290.3 ± 32.1 | 146.6 ± 28.9 | 434.2 ± 55.7 | 15.8 ± 4.9 | 277.2 ± 37.6 | 360.1 ± 28.5 | 222.9 ± 24.3 | 536.7 ± 30.8 |
| Mean (±SD) rarefied no. of OTUs | 8.2 ± 2.1 | 117.1 ± 21.4 | 229.7 ± 20.5 | 85.7 ± 15.1 | 333.5 ± 34.9 | 9.6 ± 2.2 | 202.6 ± 16.4 | 312.1 ± 23.2 | 157.4 ± 15.5 | 484.4 ± 30.7 |
| Chao1 [95% CI] | 46.0 [40.1, 67.8] | 1,075.6 [953.3, 1,247.2] | 1,204.9 [1,106.4, 1,338.2] | 638.7 [568.1, 746.9] | 1,711.4 [1,596.5, 1,859.9] | 104.8 [80.2, 166.9] | 1,183.0 [1,082.1, 1,322.8] | 1,486.0 [1,356.2, 1,660.2] | 858.0 [769.8, 989.7] | 1,945.7 [1,821.2, 2,106.2] |
| Chao1 [95% CI] (rarefied) | 49.2 [39.4, 80.8] | 552.7 [469.3, 690.4] | 917.8 [843.7, 1,021.6] | 451.5 [380.0, 569.8] | 1,174.0 [1,082.3, 1,298.7] | 144.7 [70.7, 403.5] | 866.5 [772.2, 1,007.4] | 1,213.9 [1,108.9, 1,358.7] | 562.1 [505.6, 653.5] | 1,483.7 [1,384.1, 1,617.3] |
| Chao2 [95% CI] | 50.5 [41.7, 80.3] | 1,126.2 [1,001.7, 1,295.0] | 1,309.5 [1,191.1, 1,466.5] | 706.7 [621.0, 832.0] | 1,891.0 [1,746.4, 2,074.1] | 140.06 [94.3, 254.8] | 1,213.3 [1,117.7, 1,339.8] | 1,536.0 [1,406.2, 1,705.1] | 880.2 [796.2, 998.5] | 2,056.3 [1,921.3, 2,225.7] |
| Chao2 [95% CI] (rarefied) | 49.4 [39.7, 79.4] | 535.6 [468.3, 638.5] | 1,007.3 [913.2, 1,136.2] | 557.3 [449.4, 730.4] | 1,256.3 [1,149.8, 1,398.0] | 116.1 [64.6, 280] | 892.4 [804.3, 1,015.4] | 1,280.5 [1,167.8, 1,431.1] | 596.6 [533.6, 692.6] | 1,562.7 [1,455.1, 1,702.1] |
| ACE | 47.6 | 1,062.8 | 1,223.0 | 628.3 | 1,743.0 | 108.6 | 1,197.8 | 1,483.1 | 819.38 | 1,982.0 |
| ACE (rarefied) | 57.6 | 515.4 | 975.7 | 459.1 | 1,217.8 | 91.6 | 837.7 | 1,213.2 | 556.02 | 1,521.3 |
| ICE | 56.5 | 1,205.5 | 1,345.6 | 741.2 | 1,928 | 139.9 | 1,326.1 | 1,537.2 | 892.0 | 2,078.5 |
| ICE (rarefied) | 70.6 | 580.0 | 1,066.3 | 551.3 | 1,300.7 | 105.6 | 942.8 | 1,269.0 | 588.6 | 1,583.0 |
| OTUs with match to BOLD/GenBank, % | 60.5 | 14.1 | 10.6 | 16.2 | 10.2 | 57.8 | 15.7 | 11.8 | 16.9 | 11.9 |
| Unidentified OTUs, % | NA | 35.6 | 38.8 | 31.2 | 40.9 | NA | 26.8 | 27.1 | 23.8 | 28.3 |
| Singletons, % | 31.6 | 39.8 | 36.5 | 34.9 | 34.8 | 46.9 | 32.3 | 34.6 | 30.3 | 31.1 |

Each diversity estimate was calculated using both raw and rarefied OTU tables. ACE, abundance-based coverage estimator; CI, confidence interval; ICE, incidence-based coverage estimator; NA, not applicable.
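As an illustration of how one of the estimators in Table S1 is built, the sketch below computes a Chao1 point estimate from a list of per-OTU abundances. It uses the classic form, S_obs + F1²/(2·F2), with the usual bias-corrected fallback when no doubletons are present; this is an assumption about the formula variant, and EstimateS may apply a different variant or correction, so the sketch is not expected to reproduce the tabulated values or their confidence intervals. The function name `chao1` is hypothetical.

```python
from typing import Iterable


def chao1(abundances: Iterable[int]) -> float:
    """Chao1 richness estimate from per-OTU abundance counts.

    F1 = number of singletons (OTUs seen once),
    F2 = number of doubletons (OTUs seen twice).
    """
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)
    f1 = sum(1 for n in counts if n == 1)
    f2 = sum(1 for n in counts if n == 2)
    if f2 > 0:
        # Classic Chao1.
        return s_obs + (f1 * f1) / (2.0 * f2)
    # Bias-corrected form, used here only when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / 2.0
```

The estimate exceeds observed richness only when singletons are present, which is why the share of singleton OTUs reported in the last row of the table matters for interpreting the estimators.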

Table S2. Percent contribution of individual OTUs to differences between localities and fractions (based on similarity of percentage analyses)
VA FL

500 to 500-μm 106-μm 500 to 500-μm 106-μm


OTU no. Kingdom Phylum Class Subclass/order Genus/species VA-FL 106 μm sessile sessile 106 μm sessile sessile

x17 Animalia Annelida Clitellata Haplotaxida Tubificoides 1.9 2.2


parapectinatus
x19 Animalia Annelida Polychaeta Eunicida Marphysa 2.9 5.5 1.6 4.6 3.0 3.0 1.6
sanguinea
x350 Animalia Annelida Polychaeta Sabellida Branchiomma 1.6 1.8
cf. bairdi
x1039 Animalia Annelida Polychaeta Spionida Polydora 1.7 1.7
cornuta
x283 Animalia Annelida Polychaeta Spionida Streblospio 1.2 4.2 4.2
benedicti
x432 Animalia Annelida Polychaeta Terebellida Polycirrus 1.0
x1066 Animalia Annelida Polychaeta Terebellida 1.1 1.0
x26 Animalia Annelida Polychaeta 1.3 1.4
x282 Animalia Annelida Polychaeta 1.9 1.7

Leray and Knowlton www.pnas.org/cgi/content/short/1424997112


x38 Animalia Annelida 1.0 1.0
x791 Animalia Arthropoda Malacostraca Amphipoda Caprella penantis 1.0 1.1
x320 Animalia Arthropoda Malacostraca Amphipoda Monocorophium acherusicum 1.0
x136 Animalia Arthropoda Malacostraca Amphipoda Gammarus mucronatus 1.4 4.1 5.1
x165 Animalia Arthropoda Malacostraca Amphipoda 1.1 4.1 4.4
x372 Animalia Arthropoda Malacostraca Amphipoda 1.5 1.3
x83 Animalia Arthropoda Malacostraca Amphipoda 2.3 8.3 9.0
x102 Animalia Arthropoda Malacostraca Decapoda 2.8 2.5
x352 Animalia Arthropoda Malacostraca Decapoda Dyspanopeus sayi 1.2 1.3
x998 Animalia Arthropoda Malacostraca Decapoda Panopeus occidentalis 2.5 2.3
x418 Animalia Arthropoda Malacostraca Isopoda Cilicaea 2.1 1.8
x37 Animalia Arthropoda Malacostraca Stomatopoda Neogonodactylus bredini 2.3 2.2
x597 Animalia Arthropoda Maxillopoda Calanoida Centropages hamatus 1.2 1.5
x574 Animalia Arthropoda Maxillopoda Calanoida 1.9 1.3
x983 Animalia Arthropoda Maxillopoda Calanoida 1.3 1.1
x62 Animalia Arthropoda Maxillopoda Harpacticoida 1.8 1.7
x607 Animalia Arthropoda Maxillopoda Siphonostomatoida 1.5 1.4
x39 Animalia Arthropoda Maxillopoda 2.0 6.1 5.4
x397 Animalia Arthropoda Maxillopoda 1.3 1.2
x492 Animalia Arthropoda Maxillopoda 1.4 1.3
x871 Animalia Arthropoda Maxillopoda 1.9 1.4
x84 Animalia Arthropoda Ostracoda 4.1 6.2 7.1 5.8

x688 Animalia Arthropoda Pycnogonida Pantopoda 1.0 1.0
x1054 Animalia Arthropoda 2.7 2.7
x2 Animalia Arthropoda 1.0
x651 Animalia Arthropoda 2.7 2.4
x799 Animalia Arthropoda 1.2 1.0
x5 Animalia Bryozoa Gymnolaemata Cheilostomatida Bugula neritina 1.2 4.1 3.9
x3 Animalia Bryozoa Gymnolaemata Cheilostomatida Biflustra arborescens 1.0 2.5 2.9
x182 Animalia Bryozoa Gymnolaemata Cheilostomatida Schizoporella errata 4.0 11.8 10.6 7.7 8.0
x245 Animalia Bryozoa Gymnolaemata Cheilostomatida 1.6 2.4 1.6 2.3
x198 Animalia Chordata Ascidiacea Phlebobranchia Ascidia virginea 2.7 8.5 8.6

x13 Animalia Chordata Ascidiacea Stolidobranchia Symplegma rubra 3.7 1.4 9.4 8.9
x1 Animalia Chordata 1.9 1.9
x437 Animalia Chordata 1.1 3.9 3.5
x841 Animalia Cnidaria Hydrozoa Leptothecata Obelia bidentata 11.5 12.4 12.2 11.0
x1114 Animalia Cnidaria Hydrozoa 7.3 7.7 9.5 9.0
x139 Animalia Echinodermata Ophiuroidea Ophiurida Amphipholis cf. squamata 1.8 1.6
x132 Animalia Mollusca Bivalvia Ostreoida Ostrea equestris 1.6 1.6
x544 Animalia Mollusca Gastropoda Littorinimorpha Crepidula plana 1.1 1.1
x575 Animalia Porifera Demospongiae Halichondrida 2.2 2.3
x1061 Animalia Porifera Demospongiae Homosclerophorida Oscarella 3.1 3.0 6.5 6.2
x8 Chromista Ochrophyta Phaeophyceae Ectocarpales Ectocarpus siliculosus 1.0 1.3 1.4 1.1
x423 Plantae Rhodophyta Florideophyceae Ceramiales Polysiphonia 1.2 1.3
x366 Plantae Rhodophyta Florideophyceae Gracilariales Gracilaria vermiculophylla 2.2 2.5
x733 Plantae Rhodophyta Florideophyceae 1.1 1.0
x11 Unidentified 1.5 1.4
x112 Unidentified 1.3 1.2
x131 Unidentified 1.8 1.7
x147 Unidentified 2.1 1.9
x158 Unidentified 1.0 3.7 3.7
x21 Unidentified 1.5 1.4
x36 Unidentified 1.0
x581 Unidentified 1.1 1.1

Only OTUs with a contribution greater than 1% are presented.
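The percent contributions in Table S2 come from a similarity percentage (SIMPER)-type analysis. As a rough illustration of how such a value can be obtained, the sketch below splits the Bray-Curtis dissimilarity of every between-group sample pair into per-OTU terms and expresses each OTU's summed term as a percentage of the total. It assumes untransformed abundances and equal weighting of pairs; the function name `simper_percent` and the toy samples are hypothetical, and the authors' software and data transformation may differ.

```python
from itertools import product


def simper_percent(group_a, group_b):
    """Percent contribution of each OTU to between-group Bray-Curtis dissimilarity."""
    otus = sorted({otu for s in group_a + group_b for otu in s})
    contrib = dict.fromkeys(otus, 0.0)
    for a, b in product(group_a, group_b):
        # Bray-Curtis denominator: total abundance across both samples.
        denom = sum(a.get(o, 0) + b.get(o, 0) for o in otus)
        if denom == 0:
            continue
        for o in otus:
            contrib[o] += abs(a.get(o, 0) - b.get(o, 0)) / denom
    # Sums are used instead of means: the number of pairs cancels in the percentage.
    total = sum(contrib.values())
    if total == 0:
        return {o: 0.0 for o in otus}
    return {o: 100.0 * c / total for o, c in contrib.items()}


# Toy abundance tables for two localities (hypothetical, not data from the paper).
va = [{"x1": 10, "x2": 0}, {"x1": 8, "x2": 2}]
fl = [{"x1": 0, "x2": 10}, {"x1": 2, "x2": 8}]
pct = simper_percent(va, fl)
print(pct)
```

Applying a cutoff such as `{o: p for o, p in pct.items() if p > 1}` would then keep only the OTUs reported in a table of this kind.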

Table S3. Biomass and sediment in samples from VA and FL
Location: fraction     Site   Mean (±SD) biomass, g   Mean (±SD) sediment, g
VA: 2 mm to 500 μm     1      19 (±13)                59 (±87)
VA: 2 mm to 500 μm     2      19 (±7)                 82 (±29)
VA: 2 mm to 500 μm     3      17 (±13)                9 (±5)
VA: 500 to 100 μm      1      19 (±2)                 791 (±386)
VA: 500 to 100 μm      2      18 (±3)                 497 (±400)
VA: 500 to 100 μm      3      14 (±4)                 835 (±216)
VA: Sessile            1      148 (±54)               NA
VA: Sessile            2      249 (±28)               NA
VA: Sessile            3      85 (±9)                 NA
FL: 2 mm to 500 μm     1      6 (±1)                  4 (±2)
FL: 2 mm to 500 μm     2      8 (±1)                  25 (±19)
FL: 2 mm to 500 μm     3      11 (±2)                 8 (±4)
FL: 500 to 100 μm      1      11 (±2)                 110 (±109)
FL: 500 to 100 μm      2      12 (±2)                 34 (±36)
FL: 500 to 100 μm      3      21 (±10)                175 (±139)
FL: Sessile            1      118 (±25)               NA
FL: Sessile            2      127 (±18)               NA
FL: Sessile            3      225 (±8)                NA

NA, not applicable, because there was no sediment in the sessile fraction.

Table S4. Tailed PCR primers


Primer label Primer sequence (5′–3′)

mlCOIint_Tag1 AGACGCGGWACWGGWTGAACWGTWTAYCCYCC
mlCOIint_Tag2 AGTGTAGGWACWGGWTGAACWGTWTAYCCYCC
mlCOIint_Tag3 ACTAGCGGWACWGGWTGAACWGTWTAYCCYCC
mlCOIint_Tag4 ACAGTCGGWACWGGWTGAACWGTWTAYCCYCC
mlCOIint_Tag5 ATCGACGGWACWGGWTGAACWGTWTAYCCYCC
mlCOIint_Tag6 ATGTCGGGWACWGGWTGAACWGTWTAYCCYCC
mlCOIint_Tag7 ATAGCAGGWACWGGWTGAACWGTWTAYCCYCC
jgHCO_Tag1 AGACGCTAIACYTCIGGRTGICCRAARAAYCA
jgHCO_Tag2 AGTGTATAIACYTCIGGRTGICCRAARAAYCA
jgHCO_Tag3 ACTAGCTAIACYTCIGGRTGICCRAARAAYCA
jgHCO_Tag4 ACAGTCTAIACYTCIGGRTGICCRAARAAYCA
jgHCO_Tag5 ATCGACTAIACYTCIGGRTGICCRAARAAYCA
jgHCO_Tag6 ATGTCGTAIACYTCIGGRTGICCRAARAAYCA
jgHCO_Tag7 ATAGCATAIACYTCIGGRTGICCRAARAAYCA
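Each tailed primer in Table S4 is a 6-nt tag followed by a shared degenerate primer sequence, so a read can be assigned to a tag by matching its 5′ end. The sketch below shows one way to do this for three of the forward primers. It assumes exact tag matches and that the inosine (I) in the reverse primers pairs with any base; the helper names (`split_tag`, `primer_regex`, `assign_read`) and the example read are hypothetical.

```python
import re

# IUPAC codes used in Table S4; treating inosine (I) as "any base" is an assumption.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "R": "[AG]", "Y": "[CT]", "I": "[ACGT]"}

# Sequence shared by all mlCOIint rows once the tag is removed.
FORWARD = "GGWACWGGWTGAACWGTWTAYCCYCC"

TAILED = {
    "mlCOIint_Tag1": "AGACGCGGWACWGGWTGAACWGTWTAYCCYCC",
    "mlCOIint_Tag2": "AGTGTAGGWACWGGWTGAACWGTWTAYCCYCC",
    "mlCOIint_Tag3": "ACTAGCGGWACWGGWTGAACWGTWTAYCCYCC",
}


def split_tag(tailed, primer):
    """Return the 5' tag that precedes `primer` in a tailed primer."""
    if not tailed.endswith(primer):
        raise ValueError("tailed primer does not end with the shared primer")
    return tailed[: -len(primer)]


def primer_regex(primer):
    """Translate a degenerate primer into a regular expression."""
    return "".join(IUPAC[base] for base in primer)


TAGS = {split_tag(seq, FORWARD): label for label, seq in TAILED.items()}


def assign_read(read):
    """Return the label whose tag + primer matches the start of `read`, else None."""
    for tag, label in TAGS.items():
        if re.match(tag + primer_regex(FORWARD), read):
            return label
    return None


# Hypothetical read: Tag2, one concrete version of the primer, then insert sequence.
concrete = FORWARD.replace("W", "A").replace("Y", "C")
read = "AGTGTA" + concrete + "ACGTACGT"
print(assign_read(read))
```

Real demultiplexing tools usually also tolerate mismatches and check the reverse tag, which this sketch omits.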

Leray and Knowlton www.pnas.org/cgi/content/short/1424997112
