Adjust quality scores from alignment and improve sequencing accuracy

Abstract

In shotgun sequencing, statistical reconstruction of a consensus from alignment requires a model of measurement error. Churchill and Waterman proposed one such model and an expectation-maximization (EM) algorithm to estimate sequencing error rates for each assembly matrix. Ewing and Green defined Phred quality scores for base-calling from sequencing traces by training a model on a large amount of data. However, sample preparations and sequencing machines may work under different conditions in practice and therefore quality scores need to be adjusted. Moreover, the information given by quality scores is incomplete in the sense that they do not describe error patterns. We observe that each nucleotide base has its specific error pattern that varies across the range of quality values. We develop models of measurement error for shotgun sequencing by combining the two perspectives above. We propose a logistic model taking quality scores as covariates. The model is trained by a procedure combining an EM algorithm and model selection techniques. The training results in calibration of quality values and leads to a more accurate construction of consensus. Besides Phred scores obtained from ABI sequencers, we apply the same technique to calibrate quality values that come along with Beckman sequencers. © Oxford University Press 2004; all rights reserved.
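
For readers unfamiliar with the scale, a Phred score Q encodes a nominal error probability p = 10^(-Q/10). The sketch below illustrates that conversion and the general shape of a logistic curve with the quality score as its only covariate. It is not the authors' model: the function names, the two-parameter form, and the coefficients `b0` and `b1` are assumptions made for illustration, and the EM training and per-base error patterns described in the abstract are not reproduced here.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Nominal error probability implied by a Phred score: p = 10^(-Q/10)."""
    return 10.0 ** (-q / 10.0)


def error_prob_to_phred(p: float) -> float:
    """Inverse conversion: Q = -10 * log10(p)."""
    return -10.0 * math.log10(p)


def logistic_error_prob(q: float, b0: float, b1: float) -> float:
    """Hypothetical calibration curve: logit(p) = b0 + b1 * q.

    b0 and b1 are illustrative coefficients, not values from the paper.
    """
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * q)))


def calibrated_quality(q: float, b0: float, b1: float) -> float:
    """Re-express the calibrated error probability on the Phred scale."""
    return error_prob_to_phred(logistic_error_prob(q, b0, b1))


# With b0 = 0 and this slope, the logistic curve equals p / (1 + p),
# which stays close to the nominal Phred probability p when p is small.
NOMINAL_SLOPE = -math.log(10.0) / 10.0

if __name__ == "__main__":
    for q in (10, 20, 30, 40):
        nominal = phred_to_error_prob(q)
        # b0 = 1.0 is an arbitrary shift standing in for a fitted intercept.
        adjusted = calibrated_quality(q, 1.0, NOMINAL_SLOPE)
        print(f"Q={q:2d}  nominal p={nominal:.5f}  adjusted Q={adjusted:.2f}")
```

A positive intercept shifts every reported score downward, which is the kind of adjustment a calibration would make if a sequencer's scores turned out to be optimistic.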

Cite

Li, M., Nordborg, M., & Li, L. M. (2004). Adjust quality scores from alignment and improve sequencing accuracy. Nucleic Acids Research, 32(17), 5183–5191. https://doi.org/10.1093/nar/gkh850
