
Pyrosequencing

Pyrosequencing is a new DNA sequencing technique, initially developed by Mostafa Ronaghi and colleagues in the late 1990s (Ronaghi et al. 1996, 1998, 2001). It is based on sequencing by synthesis: DNA synthesis is coupled to a chemiluminescent reaction, which allows sequences to be determined rapidly and in real time. The technique uses four enzymatic reactions that take place in a single tube, in which synthesis of the complementary DNA strand is monitored using single-stranded DNA as the template. Nucleotides are added to the reaction one after another and, whenever one is incorporated, inorganic pyrophosphate (PPi) is released. PPi triggers a series of reactions that produce light in proportion to the amount of DNA and the number of nucleotides incorporated. The light is detected as a peak and recorded by a detection system, reflecting the activity of the enzymes in the reaction.

Pyrosequencing is carried out in five steps:
(1) PCR-amplified ssDNA is hybridized to the sequencing primer and incubated with the enzymes DNA polymerase, ATP sulfurylase, luciferase and apyrase, plus the substrates adenosine 5' phosphosulfate (APS) and luciferin.
(2) The addition of one of the four dNTPs starts the second step, in which DNA polymerase catalyzes incorporation of the dNTP if it is complementary to the template. Importantly, each incorporation releases PPi in an amount equivalent to the amount of dNTP incorporated.
(3) ATP sulfurylase quantitatively converts PPi to ATP in the presence of APS. The ATP drives the luciferase-catalyzed conversion of luciferin to oxyluciferin, generating visible light in proportion to the amount of ATP present. The emitted light is detected by a CCD camera and can be analyzed by the software. Each light signal is proportional to the number of nucleotides incorporated.
(4) To continue sequencing, the dNTPs that were not incorporated must be degraded. This is done by the enzyme apyrase.
(5) New dNTPs can then be added to start a new cycle.
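The five steps above can be sketched as a small simulation. This is a minimal, idealized sketch and not the instrument's software: it assumes one unit of light per incorporated nucleotide, complete degradation of unincorporated dNTPs by apyrase between dispensations, and a fixed cyclic dispensation order. The function name `simulate_pyrogram` and its parameters are hypothetical.

```python
def simulate_pyrogram(synthesized, dispensation_order="ACGT", cycles=4):
    """Return a list of (nucleotide, peak_height) pairs, one per dispensation.

    `synthesized` is the strand being built by the polymerase (the
    complement of the single-stranded template), read 5' to 3'.
    Peak height is idealized as the number of nucleotides incorporated.
    """
    position = 0  # next base of `synthesized` still to be incorporated
    pyrogram = []
    for _ in range(cycles):
        for nucleotide in dispensation_order:
            incorporated = 0
            # The polymerase extends through a whole homopolymer run
            # during a single dispensation.
            while position < len(synthesized) and synthesized[position] == nucleotide:
                incorporated += 1
                position += 1
            # Unincorporated dNTPs are assumed fully degraded (apyrase step),
            # so nothing carries over to the next dispensation.
            pyrogram.append((nucleotide, incorporated))
    return pyrogram


if __name__ == "__main__":
    for nucleotide, height in simulate_pyrogram("GGATC", cycles=2):
        print(nucleotide, height)
```

For the strand GGATC and two cycles of A, C, G, T, the G dispensation in the first cycle gives a peak of height 2 (two consecutive Gs), and the A and T dispensations in the second cycle each give a peak of height 1. The final C would only appear in a third cycle.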

APPLICATIONS
Pyrosequencing is the method of choice for sequencing short DNA fragments, detecting SNPs and analyzing methylation. These analyses are crucial in biological and genetic research and in some medical and forensic applications. The technique has also been refined for whole-genome sequencing by the company 454. So far, this is the fastest genome sequencing system. Initially it had serious limitations because of the short length of the reads generated, which greatly complicates assembly, particularly for genomes with abundant repetitive DNA. For that reason the technique was normally used to resequence genomes, or to sequence genomes when closely related genomes had already been sequenced. However, read length has increased gradually (a new version capable of generating 400-bp reads is due in November), and the fragments generated can be ordered using improved sequencing and assembly methods.

ADVANTAGES
The technique is fully automated, reliable and accurate, and allows a large number of samples to be analyzed in a short time. Its cost is also lower than that of traditional sequencing methods.

.: LIFESEQUENCING :.


1- DNA library preparation
Library preparation consists of fractionating genomic DNA (gDNA) into small fragments (300 to 500 bp), which are then polished (blunt-ended) before adaptors A and B are ligated to their ends. These adaptors provide the hybridization sequences for subsequent amplification and sequencing of the library fragments. Adaptor B is biotinylated at its 5' end, which allows the library to be immobilized on streptavidin-coated beads. After nick repair, the non-biotinylated strands are released from the beads and used as the single-stranded template DNA (sstDNA) library. The sstDNA library is checked for quality, and the optimal ratio (DNA molecules:beads) needed for emulsion PCR (emPCR) is determined by titration.

2- emPCR
The sstDNA library is immobilized on beads. Each bead carries a single sstDNA molecule from the library. The bead-bound library is emulsified with the amplification reagents in a water-in-oil emulsion. Each bead is enclosed in its own microreactor, within which PCR amplification takes place. The result is a bead carrying clonally amplified DNA fragments.
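To see why the DNA molecules:beads ratio needs to be titrated, it can help to model bead loading. The sketch below assumes that template molecules distribute over beads according to a Poisson distribution; that model is an assumption made here for illustration and is not stated in the source. The function name `bead_occupancy` is hypothetical.

```python
import math


def bead_occupancy(molecules_per_bead):
    """Return (p_empty, p_single, p_multiple) for a given average loading.

    Assumes Poisson-distributed loading with mean `molecules_per_bead`
    (an illustrative assumption, not a statement about the real protocol).
    """
    lam = molecules_per_bead
    p_empty = math.exp(-lam)           # beads with no template
    p_single = lam * math.exp(-lam)    # beads with exactly one template (clonal)
    p_multiple = 1.0 - p_empty - p_single  # beads with mixed templates
    return p_empty, p_single, p_multiple


if __name__ == "__main__":
    for ratio in (0.1, 0.5, 1.0, 2.0):
        empty, single, multiple = bead_occupancy(ratio)
        print(f"{ratio:.1f} molecules/bead: "
              f"empty={empty:.3f} single={single:.3f} multiple={multiple:.3f}")
```

Under this assumption, an average of one molecule per bead would leave roughly 37% of beads with exactly one molecule. Lower ratios reduce the share of mixed beads at the cost of more empty ones, which is the trade-off a titration would settle empirically.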

3- Sequencing
The beads carrying the sstDNA library fragments are added to the DNA Bead Incubation Mix (which contains DNA polymerase) and layered onto the plate together with enzyme beads carrying luciferase and sulfurylase. The enzyme-bead layer ensures that the DNA beads stay inside the wells during the sequencing reaction. The bead deposition process maximizes the number of wells that contain a single bead with amplified DNA (avoiding more than one sstDNA-bound bead per well). Once the plate is correctly loaded, it is placed in the instrument, where the sequencing reagents (buffers and nucleotides) flow across the wells of the plate. During the nucleotide flow, each of the hundreds of thousands of beads, carrying millions of copies of DNA, is sequenced in parallel. If a nucleotide is complementary to the template strand in a given well, the polymerase extends the existing DNA strand by adding nucleotide(s). The addition of one (or more) nucleotide(s) triggers a reaction that generates a light signal, which is recorded by the instrument's CCD camera. The signal intensity is proportional to the number of nucleotides incorporated in a single nucleotide flow.
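Because the signal in each flow is proportional to the number of nucleotides incorporated, a sequence can in principle be read back from the per-flow intensities of one well. The following is a minimal sketch under stated assumptions: intensities are already normalized so that one incorporation is about 1.0, the flow order is the cyclic string "TACG" (chosen here for illustration), and each intensity is simply rounded to the nearest whole number. The function `call_bases` is hypothetical and is not the instrument's base caller.

```python
def call_bases(flow_signals, flow_order="TACG"):
    """Convert normalized per-flow light intensities into a base sequence.

    Each signal is rounded to the nearest non-negative integer, which is
    taken as the homopolymer length for the nucleotide of that flow.
    """
    sequence = []
    for index, signal in enumerate(flow_signals):
        nucleotide = flow_order[index % len(flow_order)]
        # int(x + 0.5) rounds halves up for non-negative x; max() guards
        # against slightly negative background-corrected signals.
        count = max(0, int(signal + 0.5))
        sequence.append(nucleotide * count)
    return "".join(sequence)


if __name__ == "__main__":
    signals = [1.02, 0.05, 2.10, 0.00, 0.00, 0.97, 0.10, 1.00]
    print(call_bases(signals))
```

With the example signals, the flows read T x1, A x0, C x2, G x0, T x0, A x1, C x0, G x1, giving the sequence TCCAG. Simple rounding becomes less reliable for long homopolymers, where neighbouring integer intensities are proportionally closer together.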

* The Genome Sequencer 20 / FLX System (GS 20 / GS FLX) is a product by Roche Applied Science developed by 454 Life Sciences

http://www.lifesequencing.com/technologiaworkflow.html

07/10/2008

ADVERTISING FEATURE

APPLICATION NOTES

© 2008 Nature Publishing Group http://www.nature.com/naturemethods

3K Long-Tag Paired End sequencing with the Genome Sequencer FLX System
The Genome Sequencer FLX System from Roche and 454 Life Sciences is a versatile sequencing platform suitable for a wide range of applications, including de novo sequencing and assembly of genomic DNA, transcriptome sequencing, metagenomics analysis and amplicon sequencing. The Genome Sequencer FLX enables long sequence reads separated by kilobase distances of genomic DNA. These Long-Tag Paired End reads enable improved de novo assemblies and genomic structural variation studies.
454 Life Sciences has developed and commercially released a new protocol for generating a library of paired-end fragments to determine the orientation and relative positions of contigs produced by de novo shotgun sequencing and assembly. This 3K Long-Tag Paired End protocol (Fig. 1) can also be used to identify genomic structural variations (ref. 1) and their associated breakpoints. Structural variation of the genome, involving large, kilobase- to megabase-sized deletions, duplications, insertions, inversions and complex combinations of rearrangements, is widespread in humans and is presumably responsible for a considerable amount of phenotypic variation. The 3K Long-Tag Paired End library DNA fragments comprise an approximately 250-bp fragment with a 44-mer adaptor sequence in the middle, flanked by 100-mer sequences, on average. The two flanking 100-bp sequences are segments of DNA that were originally located approximately 3 kb apart in the genome of interest. Traditional approaches to the sequencing of paired-end reads rely upon inserting a DNA fragment into a vector, such as a bacterial artificial chromosome or a fosmid, cloning this into bacteria and subsequently generating two sequences, one from each end of the vector. These methods entailed weeks of laboratory work and could cost several hundred thousand dollars to prepare the libraries needed for Sanger sequencing. The Genome Sequencer FLX method presented here, which requires no cloning, generates up to 200,000 paired-end reads from a single Genome Sequencer FLX instrument run with a total elapsed time, from genomic DNA to result, of less than 4 days.
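The fragment layout described above (two tags of about 100 bases flanking a central 44-mer adaptor) suggests how a read could be split into its two tags. The sketch below is illustrative only: the linker sequence is a short made-up placeholder, not the real 44-mer, and exact string matching is assumed, whereas real reads would need error-tolerant matching. The function name `split_paired_end_read` is hypothetical.

```python
def split_paired_end_read(read, linker):
    """Split a read into (left_tag, right_tag) around the linker.

    Returns None when the linker is absent (a linker-negative read).
    Exact matching is an assumption made for this illustration.
    """
    index = read.find(linker)
    if index == -1:
        return None
    left_tag = read[:index]
    right_tag = read[index + len(linker):]
    return left_tag, right_tag


if __name__ == "__main__":
    # Placeholder linker for illustration; NOT the real adaptor sequence.
    hypothetical_linker = "GTTGGAACCGAAAGGG"
    read = "ACGTACGTAC" + hypothetical_linker + "TTGACCATGA"
    print(split_paired_end_read(read, hypothetical_linker))
    print(split_paired_end_read("ACGTACGTAC", hypothetical_linker))
```

The two tags returned for a linker-positive read come from positions that were originally about 3 kb apart in the genome, which is what makes them useful for ordering contigs and for spotting structural variation.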
Sample preparation protocol

The preparation of a 3K Long-Tag Paired End library is depicted schematically in Figure 1. The protocol begins with fragmentation of the high-molecular-weight DNA sample; the size distribution of the fragments (on average 3 kb) determines the distance between the paired-end sequencing tags. The fragments are methylated to prevent EcoRI cleavage, Hairpin Adaptors (biotinylated and containing nonmethylated EcoRI recognition sites, provided in the GS Paired End Adaptor Kit) are ligated onto both ends, and all DNA species that are not protected by hairpins are removed by exonuclease digestion. The remaining long insert fragments are circularized by digestion with EcoRI to remove the terminal hairpin structures, providing cohesive ends for ligation. The resulting 3-kb circular fragments contain the 44-bp linker (the remainder of the two Hairpin Adaptors) joining the two ends of the fragmented DNA. The DNA circles are then fractionated by nebulization, generating molecules that are a few hundred base pairs in length. Long Paired End Adaptors are ligated to the ends of the linker-positive fragments. The adaptors provide priming sequences for both amplification and sequencing of the Paired End library fragments. This library is ready for emulsion-based clonal amplification (emPCR™) using the GS emPCR Kit II (Amplicon A, Paired End) and for sequencing using appropriate GS Sequencing and GS PicoTiterPlate kits and the Genome Sequencer FLX instrument.

Figure 1 | A schematic of the 3K Long-Tag Paired End sequencing protocol. SA, streptavidin; Met, methylated; Bio, biotin. [Schematic not reproduced. Panel labels: Genomic DNA; Fragment, methylate and polish; Biotinylated Hairpin Adaptor ligation; Exonuclease treatment and Adaptor cleavage; Circularization; Nebulization; Long Paired End Adaptor ligation; Streptavidin capture of biotinylated Adaptors; emPCR; 454 Sequencing (~100-base tags, clonally amplified fragments).]

Thomas Jarvie (1) & Timothy Harkins (2)

(1) 454 Life Sciences, 20 Commercial Street, Branford, Connecticut 06405, USA. (2) Roche Diagnostics, Roche Applied Science, 9115 Hague Road, Indianapolis, Indiana 46250, USA. Correspondence should be addressed to T.J. (thomas.jarvie@roche.com).

NATURE METHODS | MAY 2008

Figure 2 | Sequencing results from a typical 3K Long-Tag Paired End library prep of E. coli K-12. A region of the de novo assembly of E. coli K-12, with the de novo assembled contigs covering the region shown in blue along the bottom axis. The paired-end reads generated with this protocol are capable of bridging the 0.2-kb and 1.5-kb gaps between the contigs, highlighted in green. [Plots not reproduced. Axis labels: Distance (bp); Genomic position.]

Table 1 | Comparison of data from three methods of de novo assembly

                         Shotgun     50:50 mix (a)   3K LT PE (a)
E. coli K-12
  Coverage depth         20          20              20
  Assembly contigs       147         113             121
  Assembly coverage      97.62%      97.67%          97.49%
  Overall accuracy       100.000%    99.999%         99.999%
  Average contig (kb)    31.1        40.3            37.5
  Largest contig (kb)    268.0       268.0           209.3
  Scaffolds              -           11              10
T. thermophilus
  Coverage depth         20          20              20
  Assembly contigs       52          65              278
  Assembly coverage      98.01%      98.08%          96.36%
  Overall accuracy       99.997%     99.998%         99.998%
  Average contig (kb)    40.6        32.5            7.5
  Largest contig (kb)    482.3       291.8           49.8
  Scaffolds              -           7               15
C. jejuni
  Coverage depth         23          20              20
  Assembly contigs       32          33              39
  Assembly coverage      97.54%      97.59%          97.59%
  Overall accuracy       99.993%     99.998%         99.997%
  Average contig (kb)    50.4        48.9            41.5
  Largest contig (kb)    304.5       304.5           304.5
  Scaffolds              -           4               5

Assemblies
Examples of assemblies resulting from the 3K Long-Tag Paired End protocol are shown in Table 1. We assembled three bacterial genomes, Escherichia coli K-12, Thermus thermophilus and Campylobacter jejuni, by three different sequencing methods: only shotgun sequencing reads (250-300 bases in length); a 50:50 mix of shotgun and 3K Long-Tag Paired End sequencing reads; and only 3K Long-Tag Paired End sequencing reads. In all of the assemblies, the number of reads in the data sets was minimized to an approximately 20× depth of coverage by randomly discarding sequence reads. The data used in the 50:50 mix were a 10× depth of shotgun reads and 10× depth of reads generated using the 3K Long-Tag Paired End protocol. The GS de novo Assembler Software (version 1.1.03) identifies the reads as either linker positive or linker negative. The initial step in the assembly is the generation of a de novo shotgun assembly using the linker-negative reads and the DNA reads on either side of the linker. Once the de novo assembler places the shotgun reads into contigs, the linker-positive reads (Long-Tag Paired End reads) are used to orient the contigs into scaffolds (Fig. 2). This assembly method is used with 3K Long-Tag Paired End read data alone and when the 3K Long-Tag Paired End data are mixed with shotgun data.

All three methods of assembly generate comprehensive, highly accurate assemblies (Table 1). The choice of which experimental approach and assembly method to use depends on the goals of the research. If a quick view of the genome (for example, to identify which genes are present) is desired, a shotgun-only approach is suitable. If the research goal is to generate a high-quality draft of the target genome, then the inclusion of Long-Tag Paired End data is the best option.
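The step of reducing each data set to roughly 20× coverage by randomly discarding reads can be sketched as follows. This is one plausible way to do it, not the procedure the authors used: depth is taken as total retained bases divided by genome size, and reads are kept in random order until the target is reached. The function name `downsample_to_depth`, its parameters and the fixed seed are assumptions for illustration.

```python
import random


def downsample_to_depth(read_lengths, genome_size, target_depth, seed=0):
    """Randomly keep reads until total bases reach target_depth * genome_size.

    Returns (sorted indices of kept reads, achieved depth of coverage).
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    order = list(range(len(read_lengths)))
    rng.shuffle(order)

    target_bases = target_depth * genome_size
    kept = []
    total_bases = 0
    for index in order:
        if total_bases >= target_bases:
            break  # enough coverage; all remaining reads are discarded
        kept.append(index)
        total_bases += read_lengths[index]
    return sorted(kept), total_bases / genome_size


if __name__ == "__main__":
    # Toy data: 1,000 reads of 250 bases over a 5,000-base "genome".
    kept, depth = downsample_to_depth([250] * 1000, genome_size=5000, target_depth=20)
    print(len(kept), depth)
```

In the toy example, 400 reads of 250 bases are retained, which gives exactly 20× coverage of the 5,000-base genome. For a 50:50 mix, the same function could be applied separately to the shotgun and paired-end read sets with a target depth of 10 each.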

Table 1 notes: The assembly coverage represents the non-repeat portions of the genome. Overall accuracy was determined by mapping the assembly contigs against the reference genome and reporting discrepancies. Inclusion of paired-end data into the assemblies aligns the assembly contigs into scaffolds. (a) 50:50 mix, 50:50 mix of shotgun and 3K Long-Tag Paired End sequencing data; 3K LT PE, pure 3K Long-Tag Paired End sequencing data.

Summary
The sequencing of kilobase-sized inserts is quite valuable for a number of applications, including improved de novo assembly and identification of genomic structural variations. The 3K Long-Tag Paired End protocol provides a quick, efficient and cost-effective method for generating hundreds of thousands of sequence reads, each containing a pair of ~100-bp reads separated by 3-kb size inserts. Future development plans include a protocol for sequencing tags separated by 15- to 20-kb distances. The combination of both 3-kb and longer paired-end spacing will better enable the assembly of larger and more complex genomes.

Additional information about the Genome Sequencer System is available from Roche Applied Science (http://www.genome-sequencing.com).

454, 454 Life Sciences, 454 Sequencing, emPCR and PicoTiterPlate are trademarks of 454 Life Sciences Corporation, Branford, Connecticut, USA. For life science research use only. Not for use in diagnostic procedures. License disclaimer information is available online (http://www.genome-sequencing.com).
1. Korbel, J.O. et al. Paired-end mapping reveals extensive structural variation in the human genome. Science 318, 420-426 (2007).

This article was submitted to Nature Methods by a commercial organization and has not been peer reviewed. Nature Methods takes no responsibility for the accuracy or otherwise of the information provided.
