PANDAseq: Paired-end assembler for illumina sequences

Andre P. Masella; Andrea K. Bartram; Jakub M. Truszkowski; Daniel G. Brown; Josh D. Neufeld

Journal Article (Open Access)

BMC Bioinformatics (2012) 13(1)

DOI: 10.1186/1471-2105-13-31

Abstract

Background: Illumina paired-end reads are used to analyse microbial communities by targeting amplicons of the 16S rRNA gene. Publicly available tools are needed to assemble overlapping paired-end reads while correcting mismatches and uncalled bases; many errors could be corrected to obtain higher sequence yields using quality information.

Results: PANDAseq assembles paired-end reads rapidly and with the correction of most errors. Uncertain error corrections come from reads with many low-quality bases identified by upstream processing. Benchmarks were done using real error masks on simulated data, a pure source template, and a pooled template of genomic DNA from known organisms. PANDAseq assembled reads more rapidly and with reduced error incorporation compared to alternative methods.

Conclusions: PANDAseq rapidly assembles sequences and scales to billions of paired-end reads. Assembly of control libraries showed a 4-50% increase in the number of assembled sequences over naïve assembly with negligible loss of "good" sequence.

© 2012 Masella et al.; licensee BioMed Central Ltd.
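To make the idea of overlap assembly with quality-based correction concrete, here is a minimal sketch in Python. It is not PANDAseq's own algorithm, which the abstract does not detail; it is a simplified illustration that assumes the reverse read is given 5'→3' as sequenced, that qualities are integer Phred scores, and that a fixed mismatch-fraction threshold is an acceptable overlap test. The names `merge_pair`, `min_overlap`, and `max_mismatch_frac` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_pair(fwd, fwd_qual, rev, rev_qual, min_overlap=4, max_mismatch_frac=0.2):
    """Merge a read pair by the longest acceptable overlap.

    Illustrative only: accepts the longest overlap whose mismatch
    fraction is at most max_mismatch_frac (an assumed threshold), then
    resolves each disagreeing position in favour of the base with the
    higher Phred quality. Uncalled bases (N) yield to the other read.
    Returns the merged sequence, or None if no overlap is acceptable.
    """
    rc = reverse_complement(rev)
    rc_qual = rev_qual[::-1]  # qualities must be reversed with the read

    for overlap in range(min(len(fwd), len(rc)), min_overlap - 1, -1):
        start = len(fwd) - overlap
        pairs = list(zip(fwd[start:], rc[:overlap]))
        # Positions involving N are not counted as mismatches.
        mismatches = sum(1 for a, b in pairs if a != b and "N" not in (a, b))
        if mismatches / overlap > max_mismatch_frac:
            continue

        consensus = []
        for i, (a, b) in enumerate(pairs):
            if a == b or b == "N":
                consensus.append(a)
            elif a == "N":
                consensus.append(b)
            else:
                # Mismatch: keep the base called with higher quality.
                consensus.append(a if fwd_qual[start + i] >= rc_qual[i] else b)
        return fwd[:start] + "".join(consensus) + rc[overlap:]

    return None


# Usage: a 16-base fragment read from both ends with an 8-base overlap.
# The forward read carries one low-quality error (T called as A).
fwd = "AAACCCGGGTAT"
fwd_qual = [30] * 12
fwd_qual[10] = 5
rev = "ACGTAAACCCGG"
rev_qual = [30] * 12
merged = merge_pair(fwd, fwd_qual, rev, rev_qual)
```

In this example the erroneous base in the forward read is replaced by the higher-quality call from the reverse read, so the merged sequence matches the original fragment. A production assembler would score overlaps statistically rather than with a fixed threshold.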

Cite

Masella, A. P., Bartram, A. K., Truszkowski, J. M., Brown, D. G., & Neufeld, J. D. (2012). PANDAseq: Paired-end assembler for illumina sequences. BMC Bioinformatics, 13(1). https://doi.org/10.1186/1471-2105-13-31
