Motif discovery has recently received considerable interest from both computational biologists and computer scientists. Identifying motifs is important for understanding the mechanisms that regulate gene expression. Although many algorithms have been proposed to solve this problem, only some of them use prior information about motifs. In this paper, we propose a method to limit the search space of existing motif discovery methods. Our method is based on the following observation: if some elements are conserved, then these elements may be part of a conserved motif. The proposed approach follows the divide and conquer concept: we divide each DNA sequence into four subsequences, one for each of the four letters that represent the nucleotides, namely {A, C, G, T}. We then treat the subsequences for G as the major source for deciding on candidate motifs, because G is found in almost all transcription factor binding sites; the decision is supported and enhanced by the subsequences of the other three letters. We have applied this idea to the yst04 and hm03r datasets; the results are encouraging, as we have successfully predicted the locations of some of the motifs hidden within the analyzed sequences. © Springer-Verlag Berlin Heidelberg 2008.
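The abstract does not say how the four per-letter subsequences are represented. As a minimal sketch, assuming each subsequence is simply the list of positions at which that letter occurs, the division step might look like the following. The function name `split_by_nucleotide` and the toy sequence are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

NUCLEOTIDES = "ACGT"


def split_by_nucleotide(sequence: str) -> Dict[str, List[int]]:
    """Return, for each of A, C, G, T, the 0-based positions where it occurs.

    Assumption: a per-letter "subsequence" is modeled as a position list.
    The paper may use a different representation.
    """
    positions: Dict[str, List[int]] = {base: [] for base in NUCLEOTIDES}
    for index, base in enumerate(sequence.upper()):
        if base in positions:  # ignore any other symbol, e.g. the ambiguity code N
            positions[base].append(index)
    return positions


# Toy sequence, not drawn from yst04 or hm03r.
parts = split_by_nucleotide("GATTACAGG")
print(parts["G"])  # positions of G, the letter the method relies on most
```

Under this reading, the G position list would be the primary input for proposing candidate motif locations, with the A, C, and T lists used to support or refine those candidates.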
CITATION STYLE
Alshalalfa, M., & Alhajj, R. (2008). Motif location prediction by divide and conquer. Communications in Computer and Information Science, 13, 102–113. https://doi.org/10.1007/978-3-540-70600-7_8