In this article, we address the problem of modeling generic features of structurally but not textually related DNA motifs, that is, motifs whose consensus sequences are entirely different but nevertheless share "metasequence features" reflecting similarities in the DNA-binding domains of their associated protein recognizers. We present MotifPrototyper, a profile Bayesian model that can capture structural properties typical of particular families of motifs. Each family corresponds to transcription regulatory proteins with similar types of structural signatures in their DNA-binding domains. We show how to train MotifPrototypers from biologically identified motifs categorized according to the TRANSFAC categorization of transcription factors and present empirical results of motif classification, motif parameter estimation, and de novo motif detection by using the learned profile models.
Mendeley saves you time finding and organizing research
Choose a citation style from the tabs below