Efficient algorithms for regular expression constrained sequence alignment

Abstract

Imposing constraints is an effective means of incorporating biological knowledge into alignment procedures. As in the PROSITE database, functional sites of proteins can be effectively described as regular expressions. In an alignment of protein sequences it is natural to expect that functional motifs should be aligned together. Motivated by this, Arslan introduced the regular expression constrained sequence alignment problem at CPM 2005 and proposed an algorithm that can take up to $O(|\Sigma|^2 |V|^4 n^2)$ time and $O(|\Sigma|^2 |V|^4 n)$ space, where $\Sigma$ is the alphabet, $n$ is the sequence length, and $V$ is the set of states in an automaton equivalent to the input regular expression. In this paper we propose a more efficient algorithm for this problem, which takes $O(|V|^3 n^2)$ time and $O(|V|^2 n)$ space in the worst case. If $|V| = O(\log n)$, we propose another algorithm with time complexity $O(|V|^2 \log |V| \, n^2)$. The time complexity of our algorithms is independent of $\Sigma$, which is desirable in protein applications, where the formulation of this problem originates; a factor of $|\Sigma|^2 = 400$ (for the 20-letter amino acid alphabet) in the time complexity of the previously proposed algorithm would significantly affect efficiency in practice. © Springer-Verlag Berlin Heidelberg 2006.
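To make the gap between the stated bounds concrete, the following is a minimal sketch that evaluates the leading terms from the abstract with all hidden constants set to 1. The function names are hypothetical and chosen for illustration only. The results are not running times, and the base-2 logarithm is an assumption, since the abstract does not fix a base.

```python
import math


def arslan_time_bound(sigma: int, v: int, n: int) -> int:
    """Leading term |Sigma|^2 |V|^4 n^2 of the earlier algorithm's time bound."""
    return sigma**2 * v**4 * n**2


def improved_time_bound(v: int, n: int) -> int:
    """Leading term |V|^3 n^2 of the proposed algorithm's time bound."""
    return v**3 * n**2


def small_automaton_time_bound(v: int, n: int) -> float:
    """Leading term |V|^2 log|V| n^2, stated only for |V| = O(log n).

    Base-2 logarithm is an assumption made for this sketch.
    """
    return v**2 * math.log2(v) * n**2


def arslan_space_bound(sigma: int, v: int, n: int) -> int:
    """Leading term |Sigma|^2 |V|^4 n of the earlier algorithm's space bound."""
    return sigma**2 * v**4 * n


def improved_space_bound(v: int, n: int) -> int:
    """Leading term |V|^2 n of the proposed algorithm's space bound."""
    return v**2 * n


def time_bound_ratio(sigma: int, v: int) -> int:
    """Ratio of the two time bounds; n^2 cancels, leaving |Sigma|^2 |V|."""
    return sigma**2 * v
```

For example, with the 20-letter amino acid alphabet, |V| = 10, and n = 1000, the two time bounds evaluate to 4 × 10^12 and 10^9, a ratio of 4000, of which the factor 400 comes from the alphabet alone.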

Cite

Chung, Y. S., Lu, C. L., & Tang, C. Y. (2006). Efficient algorithms for regular expression constrained sequence alignment. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4009 LNCS, pp. 389–400). Springer Verlag. https://doi.org/10.1007/11780441_35
