Abstract
Next Generation Sequencing (NGS) technologies produce large quantities of short reads with relatively high error rates. Erroneous reads that cannot be aligned are either ignored during de novo sequencing or must be suitably corrected. Such reads pose problems for mapping as well, since it is difficult to distinguish errors from true variants. Methods for detecting and correcting errors typically rely on the frequencies of substrings of the reads. Suffix trees are often used for this purpose, since they can index and count the frequencies of substrings of all lengths. Existing suffix-tree-based methods detect errors by identifying statistically under-represented branches (suffixes) and fix them. However, they do not refer back to the reads to put the correction in context. Since an error in a single read manifests itself at multiple nodes of a suffix tree, a read-driven approach that relies on these multiple manifestations is expected to perform better. Based on this observation, we develop an algorithm, Pluribus, which reconciles the corrections suggested by multiple manifestations of an error using a voting scheme. We compare the accuracy of Pluribus in detecting and correcting errors against existing error correction techniques using simulated sequencing data. We also assess the impact of error correction on the performance of sequence assembly. Our results show that Pluribus corrects errors with improved precision and enables the assembler to generate longer contigs, particularly when the genome is longer or coverage is lower. Copyright © 2007 by the Association for Computing Machinery.
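The voting idea described in the abstract can be illustrated with a small sketch. This is not the Pluribus algorithm: it swaps the suffix tree for plain fixed-length k-mer counts, and the function names, the `min_count` and `min_votes` thresholds, and the toy reads are all assumptions made for illustration. It only shows how a single substitution error appears in several overlapping substrings of a read, and how the corrections those substrings suggest can be reconciled by a vote.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every length-k substring across all reads (stand-in for a suffix tree)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def correct_read(read, counts, k, min_count=2, min_votes=2):
    """Apply at most one substitution, chosen by votes from under-represented k-mers.

    min_count and min_votes are hypothetical thresholds, not values from the paper.
    """
    votes = Counter()  # (read position, base) -> number of windows suggesting it
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if counts[kmer] >= min_count:
            continue  # this window is well supported, so it casts no vote
        # Try every single-base substitution inside the weak window.
        for j in range(k):
            for base in "ACGT":
                if base == kmer[j]:
                    continue
                candidate = kmer[:j] + base + kmer[j + 1:]
                if counts[candidate] >= min_count:
                    votes[(i + j, base)] += 1
    if not votes:
        return read
    (pos, base), n = votes.most_common(1)[0]
    if n < min_votes:
        return read  # too few manifestations agree; leave the read unchanged
    return read[:pos] + base + read[pos + 1:]


# Toy data: five error-free copies of a read plus one copy with a T->A error.
true_read = "ACGTTGCATG"
bad_read = "ACGTAGCATG"
reads = [true_read] * 5 + [bad_read]
counts = kmer_counts(reads, k=4)
fixed = correct_read(bad_read, counts, k=4)
```

In this toy input the erroneous base falls inside four overlapping 4-mers, each seen only once, and each of them independently proposes the same substitution at the same read position, so the vote is unanimous. A read whose k-mers are all well supported is returned unchanged.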
Cite
Savel, D. M., Laframboise, T., Grama, A., & Koyutürk, M. (2013). Suffix-tree based error correction of NGS reads using multiple manifestations of an error. In 2013 ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics, ACM-BCB 2013 (pp. 351–358). https://doi.org/10.1145/2506583.2506644