We investigate the structure of the set of gapped motifs (repeated patterns with don't cares) of a given string of symbols. A natural equivalence classification is introduced for the motifs, based on their pattern of occurrences, and another classification for the occurrence patterns, based on the induced motifs. Quadratic-time algorithms are given for finding a maximal representative of an equivalence class, while the problems of finding a minimal representative are shown to be NP-complete. Maximal gapped motifs are shown to be composed of blocks that are maximal non-gapped motifs, which can be found using suffix-tree techniques. This leads to a bound on the number of gapped motifs that have a fixed number of non-gapped blocks. © Springer-Verlag Berlin Heidelberg 2007.
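The notions of an occurrence pattern and a maximal representative can be illustrated with a small sketch. It is not the paper's algorithm: it is a naive alignment, assuming that a don't care is written as `.`, that the text itself contains no `.`, and that a motif is "maximal" when it keeps every symbol on which all aligned occurrences agree. The names `occurrences` and `maximal_motif` are hypothetical.

```python
WILDCARD = "."  # assumed notation for a don't-care position


def occurrences(motif, text):
    """Return the start positions where motif matches text.

    A WILDCARD in the motif matches any single symbol of the text.
    """
    m = len(motif)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(c == WILDCARD or c == text[i + j] for j, c in enumerate(motif))
    ]


def maximal_motif(positions, text):
    """Align the given positions and keep every symbol all of them agree on.

    Returns (motif, shift): the motif with leading and trailing don't cares
    trimmed, and the offset of its first symbol relative to the positions.
    Runs in O(len(positions) * len(text)) time.
    """
    if not positions:
        raise ValueError("at least one occurrence position is required")
    lo = -min(positions)              # furthest left every occurrence can extend
    hi = len(text) - max(positions)   # exclusive bound on the right
    cells = []
    for d in range(lo, hi):
        symbols = {text[p + d] for p in positions}
        cells.append(symbols.pop() if len(symbols) == 1 else WILDCARD)
    solid = [k for k, c in enumerate(cells) if c != WILDCARD]
    if not solid:
        return "", 0
    first, last = solid[0], solid[-1]
    return "".join(cells[first:last + 1]), lo + first


text = "abracadabra"
print(occurrences("a..a", text))     # positions of the gapped motif a..a
print(maximal_motif([0, 7], text))   # most specific motif with those occurrences
print(maximal_motif([1, 8], text))   # occurrences of b.a extend one symbol left
```

In this example `a..a` and `abra` occur at the same positions of `abracadabra`, so under the stated assumptions they fall into the same equivalence class, with `abra` as the maximal representative. The motif `b.a` occurs at the same positions shifted by one, and the alignment recovers `abra` again, with a shift of -1.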
Ukkonen, E. (2007). Structural analysis of gapped motifs of a string. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4708 LNCS, pp. 681–690). Springer Verlag. https://doi.org/10.1007/978-3-540-74456-6_60