Seed-set construction by equi-entropy partitioning for efficient and sensitive short-read mapping

Abstract

Spaced seeds have been shown to be superior to continuous seeds for efficient and sensitive homology search based on the seed-and-extend paradigm. Much the same is true in genome mapping of high-throughput short-read data. However, a highly sensitive search with multiple spaced patterns often requires the use of a great amount of index data. We propose a novel seed-set construction method for efficient and sensitive genome mapping of short reads with relatively high error rates, which uses only continuous seeds of variable length allowing a few errors. The seed lengths and allowable error positions are optimized on the basis of entropy, which is a measure of ambiguity or repetitiveness of mapping positions. These seeds can be searched efficiently using the Burrows-Wheeler transform of the reference genome. Evaluation using actual biological SOLiD sequence data demonstrated that our method was competitive in speed and sensitivity using much less memory and disk space in comparison to spaced-seed methods. © 2011 Springer-Verlag.
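As a rough illustration of the idea of cutting a read into variable-length continuous seeds according to how ambiguous each seed is, here is a toy sketch. It is not the authors' algorithm: the abstract does not give the entropy definition or the optimization procedure. The sketch makes three assumptions: a seed's ambiguity is the base-2 logarithm of its number of exact occurrences in the reference, occurrences are counted by naive string scanning rather than with a Burrows-Wheeler index, and errors within seeds are ignored. The function names and the `max_bits` parameter are hypothetical.

```python
import math


def count_occurrences(reference: str, seed: str) -> int:
    """Count exact, possibly overlapping, occurrences of seed in reference."""
    count = 0
    start = reference.find(seed)
    while start != -1:
        count += 1
        start = reference.find(seed, start + 1)
    return count


def ambiguity_bits(reference: str, seed: str) -> float:
    """Assumed ambiguity measure: log2 of the number of mapping positions."""
    n = count_occurrences(reference, seed)
    return math.log2(n) if n > 0 else 0.0


def partition_read(reference: str, read: str, max_bits: float) -> list[str]:
    """Greedily cut the read into continuous seeds of variable length.

    Each seed is extended one base at a time until its ambiguity drops to
    max_bits or below, so every seed ends up about equally specific.
    """
    seeds = []
    start = 0
    for end in range(1, len(read) + 1):
        if ambiguity_bits(reference, read[start:end]) <= max_bits:
            seeds.append(read[start:end])
            start = end
    # Attach any leftover tail to the last seed (or keep it as the only seed).
    if start < len(read):
        if seeds:
            seeds[-1] += read[start:]
        else:
            seeds.append(read[start:])
    return seeds


if __name__ == "__main__":
    reference = "ACGTACGTTTGACCA"
    read = "ACGTTTGA"
    print(partition_read(reference, read, max_bits=0.0))
```

In this toy setting, repetitive stretches of the read produce longer seeds and distinctive stretches produce shorter ones, which is the intuition behind choosing seed lengths by entropy rather than fixing a single length.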

Cite

Kimura, K., Koike, A., & Nakai, K. (2011). Seed-set construction by equi-entropy partitioning for efficient and sensitive short-read mapping. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6833 LNBI, pp. 151–162). https://doi.org/10.1007/978-3-642-23038-7_14
