MotifOrganizer: A scalable model-based motif clustering tool for mammalian genomes


Abstract

Assembling a comprehensive catalog of all transcription factors (TFs) and the genes that they regulate (their regulons) is important for understanding gene regulation. The sequence-specific conserved binding profiles of TFs can be characterized from whole-genome sequences with phylogenetic approaches, and a large number of such profiles have been released. Effective mining of these data sources could reveal novel functional elements computationally. Because binding sites are variable, it is necessary to generalize the profiles pertinent to the same TF by clustering. The summarized familial profile is effective in identifying unknown binding sites and thus supports prediction of gene co-regulation. Here we report MotifOrganizer, a scalable model-based clustering algorithm designed for grouping motifs identified from large-scale comparative genomics studies on mammalian species. The new algorithm allows grouping of motifs with variable widths, and a novel two-stage operation scheme further increases its scalability. MotifOrganizer demonstrated favorable performance compared with distance-based and single-stage model-based clustering tools on simulated data. Tests on approximately 150k motifs from the cisRED human database demonstrated that MotifOrganizer can effectively cluster whole-genome sets of mammalian motifs.

Cite

Qin, Z. S., Bilenky, M., Su, G., & Jones, S. J. M. (2013). MotifOrganizer: A scalable model-based motif clustering tool for mammalian genomes. Frontiers in Bioscience - Elite, 5 E(2), 785–797. https://doi.org/10.2741/e659
