Flexible identification of structural objects in nucleic acid sequences: Palindromes, mirror repeats, pseudoknots and triple helices

Marie France Sagot; Alain Viari

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (1997) 1264 224-246

DOI: 10.1007/3-540-63220-4_62

Abstract

This paper presents algorithms for flexibly identifying structural objects in nucleic acid sequences: palindromes, mirror repeats, pseudoknots and triple helices. We further explore the idea of a model against which the words in a sequence are compared in order to find these structural objects [17]. In the present case, models are words defined over the alphabet of nucleotides that have both direct and inverse occurrences in the sequence. Moreover, errors (substitutions, deletions and insertions) are allowed between a model and its inverse occurrences. Helix stems may therefore present bulges or interior loops, and mirror repeats need not be exact. Reasonably efficient performance comes from two facts: the parts composing the structures are kept separate until the end, and filtering for valid occurrences (occurrences that may form part of such a structure) can be done in $O(n)$ time, where $n$ is the length of the sequence. The time complexity of the searching phase (that is, before the structural parts are put together at the end) of both algorithms presented here (one for palindromes and mirror repeats, the other for pseudoknots and triple helices) is then $O(nk(e + 1)(1 + \min\{d_{max} - d_{min} + 1 + e,\ k^{e} |\Sigma|^{e}\}))$, where $d_{max}$ and $d_{min}$ are, respectively, the maximal and minimal length of a hairpin loop, $k$ is the maximum length $k_{max}$ of a model, a fixed length, or the maximum value of a range of lengths, $e$ is the maximum number of errors allowed (substitutions, deletions and insertions), and $|\Sigma|$ is the size of the alphabet of nucleotides.
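To make the notion of an inverse occurrence with errors concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper: it does no model-based filtering and does not reach the stated complexity. It only illustrates what is being searched for in the palindrome (hairpin) case. It assumes a DNA alphabet {A, C, G, T}, takes the inverse occurrence to be the reverse complement, and measures errors as edit distance; the function names and the returned tuple layout are hypothetical choices made for illustration.

```python
def reverse_complement(word):
    """Return the reverse complement of a DNA word (assumed alphabet: A, C, G, T)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[c] for c in reversed(word))


def edit_distance(a, b):
    """Levenshtein distance: minimum number of substitutions, deletions and insertions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def find_hairpins(seq, k, e, dmin, dmax):
    """Naive search for hairpin stems.

    For every word of length k starting at position i, look for an
    approximate reverse-complement occurrence (at most e errors) that
    starts between dmin and dmax positions after the end of the word.
    Returns tuples (i, start, end, errors) with seq[start:end] the
    inverse occurrence.
    """
    hits = []
    n = len(seq)
    for i in range(n - k + 1):
        target = reverse_complement(seq[i:i + k])
        for d in range(dmin, dmax + 1):
            start = i + k + d
            # With up to e insertions or deletions, the occurrence length
            # may range from k - e to k + e.
            for length in range(max(1, k - e), k + e + 1):
                end = start + length
                if end > n:
                    break
                dist = edit_distance(target, seq[start:end])
                if dist <= e:
                    hits.append((i, start, end, dist))
    return hits


# Stem GGGAAA, loop ACGT, exact inverse occurrence TTTCCC.
print(find_hairpins("GGGAAAACGTTTTCCC", k=6, e=0, dmin=3, dmax=5))
# prints [(0, 10, 16, 0)]
```

With `e` greater than zero the same stem can be reported several times with different end points, since occurrences of neighbouring lengths may all fall within the error bound. A practical implementation would merge or rank such overlapping hits.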

Cite

Sagot, M. F., & Viari, A. (1997). Flexible identification of structural objects in nucleic acid sequences: Palindromes, mirror repeats, pseudoknots and triple helices. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1264, pp. 224–246). Springer Verlag. https://doi.org/10.1007/3-540-63220-4_62