Repeats form a major class of sequence in genomes, with implications for functional genomics and for practical problems. Their detection and analysis pose a number of challenges in genomic sequence analysis, especially if the genome is not completely sequenced. The most abundant and evolutionarily active repeats occur as families of long, similar sequences. We present a novel method for repeat family detection and characterization in cases where the target genome sequence is not completely known. To this end, we first introduce the sequence graph, a compacted version of sparse de Bruijn graphs. By analyzing the structure of this graph and its connected components after local modifications, we devise two algorithms for repeat family detection. The applicability of the methods is demonstrated on both simulated and real genomic data sets. © 2008 Springer-Verlag Berlin Heidelberg.
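To give a rough feel for the kind of structure the abstract refers to, the following minimal Python sketch builds an ordinary de Bruijn graph from a few reads and collapses its non-branching paths into single labeled strings. It is only an illustration under stated assumptions: it does not implement the authors' sparse de Bruijn graph, their sequence graph, or either of their detection algorithms, and the function names `build_de_bruijn` and `compact_paths`, the value of `k`, and the toy reads are all hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a plain de Bruijn graph (an assumption, not the paper's sparse variant).

    Every k-mer in a read becomes an edge from its (k-1)-mer prefix
    to its (k-1)-mer suffix.
    """
    succ = defaultdict(set)
    pred = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            u, v = kmer[:-1], kmer[1:]
            succ[u].add(v)
            pred[v].add(u)
    return succ, pred


def compact_paths(succ, pred):
    """Spell each maximal non-branching path as one string.

    Isolated cycles made only of unbranched nodes are skipped
    to keep the sketch short.
    """
    nodes = set(succ) | set(pred)

    def is_inner(n):
        # Exactly one way in and one way out: safe to merge through.
        return len(pred.get(n, ())) == 1 and len(succ.get(n, ())) == 1

    paths = []
    for u in sorted(nodes):
        if is_inner(u):
            continue  # paths start only at branching or terminal nodes
        for v in sorted(succ.get(u, ())):
            label = u + v[-1]
            while is_inner(v):
                (v,) = succ[v]
                label += v[-1]
            paths.append(label)
    return sorted(paths)


# Two toy reads that share the segment GTTGC in different contexts.
reads = ["ACGTTGCA", "GGGTTGCC"]
succ, pred = build_de_bruijn(reads, k=4)
paths = compact_paths(succ, pred)
print(paths)
```

In this toy input the shared segment appears as a single compacted path with branching on both sides, which is the sort of local graph structure that signals a repeated sequence; how the paper actually exploits such structure is not specified in the abstract.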
CITATION STYLE
Amgarten Quitzau, J. A., & Stoye, J. (2008). Detecting repeat families in incompletely sequenced genomes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5251 LNBI, pp. 342–353). https://doi.org/10.1007/978-3-540-87361-7_29