NestedMICA: Sensitive inference of over-represented motifs in nucleic acid sequence

Thomas A. Down; Tim J.P. Hubbard

Journal Article, Open Access

Nucleic Acids Research (2005) 33(5) 1445-1453

DOI: 10.1093/nar/gki282

Abstract

NestedMICA is a new, scalable, pattern-discovery system for finding transcription factor binding sites and similar motifs in biological sequences. Like several previous methods, NestedMICA tackles this problem by optimizing a probabilistic mixture model to fit a set of sequences. However, the use of a newly developed inference strategy called Nested Sampling means NestedMICA is able to find optimal solutions without the need for a problematic initialization or seeding step. We investigate the performance of NestedMICA in a range of scenarios, on synthetic data and a well-characterized set of muscle regulatory regions, and compare it with the popular MEME program. We show that the new method is significantly more sensitive than MEME: in one case, it successfully extracted a target motif from background sequence four times longer than could be handled by the existing program. It also performs robustly on synthetic sequences containing multiple significant motifs. When tested on a real set of regulatory sequences, NestedMICA produced motifs which were good predictors for all five abundant classes of annotated binding sites. © The Author 2005. Published by Oxford University Press. All rights reserved.

Down, T. A., & Hubbard, T. J. P. (2005). NestedMICA: Sensitive inference of over-represented motifs in nucleic acid sequence. Nucleic Acids Research, 33(5), 1445–1453. https://doi.org/10.1093/nar/gki282
