Indexing strategies for rapid searches of short words in genome sequences

Abstract

Searching for matches between large collections of short words (14-30 nucleotides) and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes; in most cases its performance is limited by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries. © 2007 Iseli et al.
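As a rough illustration of the second strategy named in the abstract (indexing the query set rather than the database), the following Python sketch builds a hash index over the query words and scans the sequence once with a sliding window. It is not the authors' implementation of tagger or fetchGWI: the function names `index_queries` and `scan_sequence`, the fixed word length, the single-strand search, and the exact-match-only behaviour are all assumptions made for the example.

```python
from collections import defaultdict


def index_queries(queries):
    """Map each query word to the ids of the queries that carry it.

    `queries` is a dict of {query_id: word}; all words are assumed to
    have the same length (a simplification made for this sketch).
    """
    index = defaultdict(list)
    for qid, word in queries.items():
        index[word.upper()].append(qid)
    return index


def scan_sequence(sequence, index, word_len):
    """Slide a window of `word_len` over `sequence` and collect exact hits.

    Returns a list of (query_id, 0-based position) tuples. Only the
    forward strand is searched; reverse complements are ignored here.
    """
    hits = []
    seq = sequence.upper()
    for pos in range(len(seq) - word_len + 1):
        # .get() avoids inserting empty entries into the defaultdict
        for qid in index.get(seq[pos:pos + word_len], ()):
            hits.append((qid, pos))
    return hits


if __name__ == "__main__":
    queries = {"q1": "ACGTACGTACGTAC", "q2": "TTTTTTTTTTTTTT"}
    index = index_queries(queries)
    print(scan_sequence("GGACGTACGTACGTACGG", index, word_len=14))
```

With this layout the cost of a search is one pass over the sequence plus one hash lookup per window, regardless of how many query words are indexed. The alternative strategy named in the abstract would instead precompute positions for the words of the database and look each query up in that index.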

Cite

Iseli, C., Ambrosini, G., Bucher, P., & Jongeneel, C. V. (2007). Indexing strategies for rapid searches of short words in genome sequences. PLoS ONE, 2(6). https://doi.org/10.1371/journal.pone.0000579
