Maximum likelihood estimation of incomplete genomic spectrum from HTS data


Abstract

High-throughput sequencing makes it possible to process samples containing multiple genomic sequences and then estimate their frequencies or even assemble them. Maximum likelihood estimation of the sequence frequencies from observed reads can be performed efficiently with the expectation-maximization (EM) method, provided the sequences present in the sample are known. Such knowledge is frequently incomplete: in RNA-Seq not all isoforms are known, and when sequencing viral quasispecies the sequences are unknown. We propose to enhance EM with a virtual string and incorporate it into frequency estimation tools for RNA-Seq and quasispecies sequencing. Our simulations show that EM enhanced with the virtual string estimates string frequencies more accurately than the original methods, and that it can find the reads from missing quasispecies, thus enabling their reconstruction. © 2011 Springer-Verlag.
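
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each read comes with a weight per known sequence (roughly, the probability of observing the read from that sequence) and runs a textbook EM loop over those weights. The "virtual string" is modelled here as one extra column with a constant weight for every read, which is a simplifying assumption; the function names `em_frequencies` and `add_virtual_string` and the parameter `virtual_weight` are hypothetical.

```python
def add_virtual_string(weights, virtual_weight=0.1):
    """Append a hypothetical 'virtual string' column to a read-by-sequence
    weight matrix. Assumption: every read matches the virtual string with
    the same constant weight, so it can absorb reads that the known
    sequences explain poorly or not at all."""
    return [list(row) + [virtual_weight] for row in weights]


def em_frequencies(weights, n_iter=200):
    """Estimate sequence frequencies by expectation-maximization.

    weights[r][s] is the (assumed given) weight of read r under sequence s.
    Returns a list of frequencies that sums to 1.
    """
    n_seqs = len(weights[0])
    freqs = [1.0 / n_seqs] * n_seqs  # uniform starting point

    for _ in range(n_iter):
        expected = [0.0] * n_seqs  # expected read counts per sequence
        assigned = 0.0             # reads that match at least one sequence

        # E-step: split each read among sequences in proportion to
        # (current frequency) * (read weight).
        for row in weights:
            denom = sum(f * w for f, w in zip(freqs, row))
            if denom == 0.0:
                continue  # read is unexplained by every sequence; skip it
            assigned += 1.0
            for s, (f, w) in enumerate(zip(freqs, row)):
                expected[s] += f * w / denom

        if assigned == 0.0:
            break

        # M-step: new frequencies are the normalized expected counts.
        freqs = [e / assigned for e in expected]

    return freqs


# Toy input: two known sequences, three reads; the last read matches neither.
toy_weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
known_only = em_frequencies(toy_weights)
with_virtual = em_frequencies(add_virtual_string(toy_weights))
```

In this toy input the third read is silently dropped when only the known sequences are used, whereas with the extra column it is attributed to the virtual string, whose estimated frequency then signals that part of the sample is unaccounted for.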

Cite

Mangul, S., Astrovskaya, I., Nicolae, M., Tork, B., Mandoiu, I., & Zelikovsky, A. (2011). Maximum likelihood estimation of incomplete genomic spectrum from HTS data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6833 LNBI, pp. 213–224). https://doi.org/10.1007/978-3-642-23038-7_19