A compact mathematical programming formulation for DNA motif finding


Abstract

In the motif finding problem, one seeks a set of mutually similar subsequences within a collection of biological sequences. This is an important and widely studied problem, as such shared motifs in DNA often correspond to regulatory elements. We study a combinatorial framework where the goal is to find subsequences of a given length such that the sum of their pairwise distances is minimized. We describe a novel integer linear program for the problem, which uses the fact that distances between subsequences come from a limited set of possibilities. We show how to tighten its linear programming relaxation by adding an exponential set of constraints and give an efficient separation algorithm that can find violated constraints, thereby showing that the tightened linear program can still be solved in polynomial time. We apply our approach to find optimal solutions for the motif finding problem and show that it is effective in practice in uncovering known transcription factor binding sites. © Springer-Verlag Berlin Heidelberg 2006.
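As an illustration of the objective only, the following minimal sketch picks one window of a given length from each sequence so that the sum of pairwise distances is smallest. It is an exhaustive search, not the integer linear program described in the abstract. The use of Hamming distance, the function names, and the toy sequences are all assumptions made for this example.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def windows(seq, length):
    """All contiguous subsequences of the given length."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def best_motif(seqs, length):
    """Try every choice of one window per sequence; keep the first choice
    with the smallest sum of pairwise distances (assumed: Hamming)."""
    best_cost, best_choice = None, None
    for choice in product(*(windows(s, length) for s in seqs)):
        cost = sum(hamming(a, b) for a, b in combinations(choice, 2))
        if best_cost is None or cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_cost, best_choice


# Hypothetical toy input: each sequence contains the 4-mer "ACGT".
seqs = ["ACGTTGCA", "TTACGTAA", "GGACGTCC"]
cost, motif = best_motif(seqs, 4)
print(cost, motif)
```

The search space grows exponentially with the number of sequences, which is one reason a mathematical programming formulation is attractive. Under the Hamming assumption, the distance between two windows of length 4 can only be an integer from 0 to 4, a small example of distances drawn from a limited set of values.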

Cite

Kingsford, C., Zaslavsky, E., & Singh, M. (2006). A compact mathematical programming formulation for DNA motif finding. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4009 LNCS, pp. 233–245). Springer Verlag. https://doi.org/10.1007/11780441_22
