Next-generation DNA sequencing me...
Helicos system only recently became commercially available, and the Pacific Biosciences instrument will likely launch commercially in early 2010. Each platform embodies a complex interplay of enzymology, chemistry, high-resolution optics, hardware, and software engineering. These instruments allow highly streamlined sample preparation steps prior to DNA sequencing, which saves significant time and requires minimal associated equipment in comparison to the highly automated, multistep pipelines necessary for clone-based high-throughput sequencing. By different approaches outlined below, each technology seeks to amplify single strands of a fragment library and perform sequencing reactions on the amplified strands. The fragment libraries are obtained by annealing platform-specific linkers to blunt-ended fragments generated directly from a genome or DNA source of interest. Because the presence of adapter sequences means that the molecules can then be selectively amplified by PCR, no bacterial cloning step is required to amplify the genomic fragment in a bacterial intermediate, as is done in traditional sequencing approaches. Importantly, both the Helicos and Pacific Biosciences instruments mentioned above are so-called "single molecule" sequencers and do not require any amplification of DNA fragments prior to sequencing.

Another contrast between these instruments and capillary platforms is the run time required to generate data. Next-generation sequencers require longer run times of between 8 h and 10 days, depending upon the platform and read type (single end or paired ends). The longer run times result mainly from the need to image sequencing reactions that are occurring in a massively parallel fashion, rather than a periodic charge-coupled device (CCD) snapshot of 96 fixed capillaries.
The yield of sequence reads and total bases per instrument run is significantly higher than the 96 reads of up to 750 bp each produced by a capillary sequencer run, and can vary from several hundred thousand reads (Roche/454) to tens of millions of reads (Illumina and Applied Biosystems SOLiD). The combination of streamlined sample preparation and long run times means that a single operator can readily keep several next-generation sequencing instruments at full capacity. The following sections aim to introduce the reader to the primary features of each of the three most widely used next-generation platforms and to discuss strengths and weaknesses.

Charge-coupled device (CCD): a capacitor array used in optical scanners to capture images

Emulsion PCR (ePCR): method for DNA amplification that uses a water-in-oil emulsion to isolate single DNA molecules in aqueous microreactors

Roche/454 FLX Pyrosequencer

This next-generation sequencer was the first to achieve commercial introduction (in 2004) and uses an alternative sequencing technology known as pyrosequencing. In pyrosequencing, each incorporation of a nucleotide by DNA polymerase results in the release of pyrophosphate, which initiates a series of downstream reactions that ultimately produce light by the firefly enzyme luciferase. The amount of light produced is proportional to the number of nucleotides incorporated (up to the point of detector saturation). In the Roche/454 approach (Figure 1), the library fragments are mixed with a population of agarose beads whose surfaces carry oligonucleotides complementary to the 454-specific adapter sequences on the fragment library, so each bead is associated with a single fragment. Each of these fragment:bead complexes is isolated into individual oil:water micelles that also contain PCR reactants, and thermal cycling (emulsion PCR) of the micelles produces approximately one million copies of each DNA fragment on the surface of each bead.
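The proportionality between light output and the number of incorporated nucleotides can be illustrated with a small simulation. The sketch below is an idealization and not the instrument's signal model: the function name `simulate_flowgram`, the cyclic flow order `TCGA`, and the saturation value are assumptions chosen for illustration only.

```python
def simulate_flowgram(read, flow_order="TCGA", saturation=8.0):
    """Return one idealized light signal per nucleotide flow.

    `read` is the strand being synthesized. Each flow offers a single
    nucleotide; the signal equals the number of consecutive matching
    bases incorporated, capped at `saturation` (a hypothetical detector
    limit, not a published instrument value).
    """
    unknown = set(read) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not covered by the flow order: {sorted(unknown)}")

    signals = []
    pos = 0          # next position in the read still to be synthesized
    flow_index = 0   # counts flows; wraps around flow_order cyclically
    while pos < len(read):
        base = flow_order[flow_index % len(flow_order)]
        run = 0
        # A homopolymer run is incorporated entirely within one flow.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append(min(float(run), saturation))
        flow_index += 1
    return signals


if __name__ == "__main__":
    # The TTT run is incorporated in a single T flow with a signal of 3.
    print(simulate_flowgram("TCGATTTCA"))
```

In this toy model a flow with no matching base yields a signal of zero, and a run longer than the saturation limit is indistinguishable from one exactly at the limit.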
These amplified single molecules are then sequenced en masse. First the beads are arrayed into a picotiter plate (PTP; a fused silica capillary structure) that holds a single bead in each of several hundred thousand single wells, which provides a fixed location at which each sequencing reaction can be monitored. Enzyme-containing beads that catalyze the downstream pyrosequencing reaction steps are then added to the PTP and the mixture is centrifuged to surround the agarose beads. On instrument, the PTP acts as a flow cell into which each pure nucleotide solution is introduced in a stepwise fashion, with an imaging step after each nucleotide incorporation step. The PTP is seated opposite a CCD camera that records the light emitted at each bead. The first four nucleotides (TCGA) on the adapter fragment adjacent to the sequencing primer added in library construction correspond to the sequential flow of nucleotides into the flow cell. This strategy allows the 454 base-calling software to calibrate the light emitted by a single nucleotide incorporation. However, the calibrated base calling cannot properly interpret long stretches (>6) of the same nucleotide (homopolymer run), so these areas are prone to base insertion and deletion errors during base calling. By contrast, because each incorporation step is nucleotide specific, substitution errors are rarely encountered in Roche/454 sequence reads.

www.annualreviews.org • Next-Generation DNA Sequencing Methods 389 by HARVARD UNIVERSITY on 03/08/09. For personal use only.

[Figure 1 panel text. (a) DNA library preparation, 4.5 hours: genome fragmented by nebulization; no cloning, no colony picking; sstDNA library created with adaptors (ligation); A/B fragments selected using avidin-biotin purification (isolate AB fragments only); gDNA to sstDNA library. (b) Emulsion PCR, 8 hours: anneal sstDNA to an excess of DNA capture beads; emulsify beads and PCR reagents in water-in-oil microreactors; clonal amplification occurs inside microreactors; break microreactors and enrich for DNA-positive beads; sstDNA library to bead-amplified sstDNA library. (c) Sequencing, 7.5 hours: well diameter averages 44 μm; 400,000 reads obtained in parallel; a single cloned amplified sstDNA bead is deposited per well; amplified sstDNA library beads to quality filtered bases.]

390 Mardis

The FLX instrument currently provides 100 flows of each nucleotide during an 8-h run, which produces an average read length of 250 nucleotides (an average of 2.5 bases per flow are incorporated). These raw reads are processed by the 454 analysis software and then screened by various quality filters to remove poor-quality sequences, mixed sequences (more than one initial DNA fragment per bead), and sequences without the initiating TCGA sequence. The resulting reads yield 100 Mb of quality data on average.

Downstream of read processing, an assembly algorithm (Newbler) can assemble FLX reads. Although shorter than reads derived from capillary sequencers, FLX reads are of sufficient length to assemble small genomes such as bacterial and viral genomes to high quality and contiguity. As mentioned, the lack of a bacterial cloning step in the Roche/454 process means that sequences not typically sampled in a WGS approach owing to cloning bias will be more likely represented in a FLX data set, which contributes to more comprehensive genome coverage.

Bridge amplification: allows the generation of in situ copies of a specific DNA molecule on an oligo-decorated solid support
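The key-based calibration and the homopolymer limitation described for Roche/454 reads can be sketched in a few lines. This is a simplified illustration and not the 454 base-calling software: the function names, the rounding rule, and the example signal values are all assumptions. The yield helper only restates the arithmetic in the text (100 flows of each nucleotide at about 2.5 bases each, with the 400,000 reads per run quoted in the Figure 1 panel).

```python
def call_bases(raw_signals, flow_order="TCGA", key="TCGA"):
    """Convert raw per-flow light signals into a base sequence.

    Assumes the first len(key) flows each incorporated exactly one
    nucleotide of the key, so their mean signal estimates the light
    produced by a single incorporation.
    """
    unit = sum(raw_signals[:len(key)]) / len(key)
    if unit <= 0:
        raise ValueError("key flows produced no signal")

    called = []
    for i, signal in enumerate(raw_signals):
        # Round to the nearest whole number of incorporations (half-up).
        count = int(signal / unit + 0.5)
        called.append(flow_order[i % len(flow_order)] * count)
    sequence = "".join(called)

    # Mirror the quality filter that drops reads lacking the initiating key.
    if not sequence.startswith(key):
        raise ValueError("read does not start with the key sequence")
    return sequence[len(key):]


def expected_yield_bases(reads, flows_per_nucleotide, bases_per_flow):
    """Rough yield estimate: reads x flows of each nucleotide x bases per flow."""
    return reads * flows_per_nucleotide * bases_per_flow


if __name__ == "__main__":
    # Hypothetical raw signals: key flows near 100 units, then a T run of 3.
    print(call_bases([102, 98, 101, 99, 310, 95, 3, 104]))
    # A noisy long run: 7.6 units is called as 8 bases. If the true run were
    # 7 bases long, this would be an insertion error.
    print(call_bases([100, 100, 100, 100, 760]))
    print(expected_yield_bases(400_000, 100, 2.5))
```

Because the call for each flow is a rounded ratio, a fixed relative error in the signal matters more as runs get longer, which is one way to see why long homopolymers produce insertions and deletions rather than substitutions.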
Illumina Genome Analyzer

The single molecule amplification step for the Illumina Genome Analyzer starts with an Illumina-specific adapter library, takes place on the oligo-derivatized surface of a flow cell, and is performed by an automated device called a Cluster Station. The flow cell is an 8-channel sealed glass microfabricated device that allows bridge amplification of fragments on its surface, and uses DNA polymerase to produce multiple DNA copies, or clusters, that each represent the single molecule that initiated the cluster amplification. A separate library can be added to each of the eight channels, or the same library can be used in all eight, or combinations thereof. Each cluster contains approximately one million copies of the original fragment, which is sufficient for reporting incorporated bases at the required signal intensity for detection during sequencing.

The Illumina system utilizes a sequencing-by-synthesis approach in which all four nucleotides are added simultaneously to the flow cell channels, along with DNA polymerase, for incorporation into the oligo-primed cluster fragments (see Figure 2 for details). Specifically, the nucleotides carry a base-unique fluorescent label and the 3′-OH group is chemically blocked such that each incorporation is a unique event. An imaging step follows each base incorporation step, during which each flow cell lane is imaged in three 100-tile segments by the instrument optics at a cluster density per tile of 30,000. After each imaging step, the 3′ blocking group is chemically removed

Figure 1 The method used by the Roche/454 sequencer to amplify single-stranded DNA copies from a fragment library on agarose beads. DNA fragments are mixed in an approximately 1:1 ratio with agarose beads carrying oligonucleotides complementary to the adapters at the fragment ends. The mixture is encapsulated by vigorous vortexing into aqueous micelles that contain PCR reactants surrounded by oil, and pipetted into a 96-well microtiter plate for PCR amplification. The resulting beads are decorated with approximately 1 million copies of the original single-stranded fragment, which provides sufficient signal strength during the pyrosequencing reaction that follows to detect and record nucleotide incorporation events. sstDNA, single-stranded template DNA.
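The Illumina imaging layout quoted in the main text (eight lanes, three 100-tile segments per lane, about 30,000 clusters per tile) implies the scale of a run, and the one-base-per-cycle chemistry suggests a very simple base-calling picture. The sketch below is a simplified illustration under stated assumptions: the function names and the channel order `ACGT` are hypothetical, and picking the brightest channel ignores the cross-talk and phasing corrections that production base callers apply.

```python
BASES = "ACGT"  # assumed channel order for this illustration


def clusters_per_flow_cell(lanes=8, segments_per_lane=3,
                           tiles_per_segment=100, clusters_per_tile=30_000):
    """Upper-bound cluster count implied by the imaging layout in the text."""
    return lanes * segments_per_lane * tiles_per_segment * clusters_per_tile


def call_cluster(intensities_per_cycle):
    """Call one base per cycle for a single cluster.

    `intensities_per_cycle` is a sequence of 4-tuples, one per sequencing
    cycle, holding the fluorescence intensity in each channel in BASES
    order. Because the 3' block allows only one incorporation per cycle,
    each cycle contributes exactly one base: the brightest channel.
    """
    read = []
    for channels in intensities_per_cycle:
        brightest = max(range(len(BASES)), key=lambda i: channels[i])
        read.append(BASES[brightest])
    return "".join(read)


if __name__ == "__main__":
    print(clusters_per_flow_cell())
    cycles = [
        (0.90, 0.10, 0.05, 0.00),  # brightest channel: A
        (0.10, 0.20, 0.80, 0.10),  # brightest channel: G
        (0.00, 0.10, 0.10, 0.70),  # brightest channel: T
    ]
    print(call_cluster(cycles))
```

The cluster count of 72 million per flow cell is consistent with the "tens of millions of reads" cited earlier for this platform, before any quality filtering.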