Systematic exploration of error sources in pyrosequencing flowgram data

Citations: 71
Mendeley readers: 163

This article is free to access.

Abstract

Motivation: 454 pyrosequencing, by Roche Diagnostics, has emerged as an alternative to Sanger sequencing when it comes to read lengths, performance and cost, but shows higher per-base error rates. Although there are several tools available for noise removal, targeting different application fields, data interpretation would benefit from a better understanding of the different error types.

Results: By exploring 454 raw data, we quantify to what extent different factors account for sequencing errors. In addition to the well-known homopolymer length inaccuracies, we have identified errors likely to originate from other stages of the sequencing process. We use our findings to extend the flowsim pipeline with functionalities to simulate these errors, and thus enable a more realistic simulation of 454 pyrosequencing data with flowsim.

© The Author(s) 2011. Published by Oxford University Press.

Cite

Balzer, S., Malde, K., & Jonassen, I. (2011). Systematic exploration of error sources in pyrosequencing flowgram data. Bioinformatics, 27(13). https://doi.org/10.1093/bioinformatics/btr251
