Text-mining assisted regulatory annotation

Stein Aerts; Maximilian Haeussler; Steven van Vooren; Obi L. Griffith; Paco Hulpiau; Steven J.M. Jones; Stephen B. Montgomery; Casey M. Bergman

Journal Article (Open Access)

Genome Biology (2008) 9(2)

DOI: 10.1186/gb-2008-9-2-r31

Abstract

Background: Decoding transcriptional regulatory networks and the genomic cis-regulatory logic implemented in their control nodes is a fundamental challenge in genome biology. High-throughput computational and experimental analyses of regulatory networks and sequences rely heavily on positive control data from prior small-scale experiments, but the vast majority of previously discovered regulatory data remains locked in the biomedical literature.

Results: We develop text-mining strategies to identify relevant publications and extract sequence information to assist the regulatory annotation process. Using a vector space model to identify Medline abstracts from papers likely to have high cis-regulatory content, we demonstrate that document relevance ranking can assist the curation of transcriptional regulatory networks and estimate that, minimally, 30,000 papers harbor unannotated cis-regulatory data. In addition, we show that DNA sequences can be extracted from primary text with high cis-regulatory content and mapped to genome sequences as a means of identifying the location, organism and target gene information that is critical to the cis-regulatory annotation process.

Conclusion: Our results demonstrate that text-mining technologies can be successfully integrated with genome annotation systems, thereby increasing the availability of annotated cis-regulatory data needed to catalyze advances in the field of gene regulation.

© 2008 Aerts et al.; licensee BioMed Central Ltd.
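The two text-mining steps named in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: the TF-IDF weighting, the centroid-of-seed-documents query, the 12-nucleotide threshold, and all function names (`rank_by_relevance`, `extract_dna`, and the helpers) are assumptions chosen for illustration. The sketch ranks candidate abstracts by cosine similarity to a set of known cis-regulatory abstracts, then pulls candidate DNA sequences out of free text. Mapping the extracted sequences to a genome, which would require an alignment tool, is not shown.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document."""
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter()
    for counts in counts_per_doc:
        doc_freq.update(counts.keys())
    # Smoothed inverse document frequency (an assumed weighting scheme).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in counts.items()} for counts in counts_per_doc]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_relevance(seed_docs, candidates):
    """Rank candidates by similarity to the centroid of the seed documents."""
    vectors = tfidf_vectors(seed_docs + candidates)
    seed_vecs = vectors[:len(seed_docs)]
    cand_vecs = vectors[len(seed_docs):]
    centroid = Counter()
    for vec in seed_vecs:
        for term, weight in vec.items():
            centroid[term] += weight / len(seed_vecs)
    scored = [(cosine(centroid, v), doc) for v, doc in zip(cand_vecs, candidates)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def extract_dna(text, min_len=12):
    """Find runs of at least min_len unambiguous nucleotides (assumed heuristic)."""
    pattern = re.compile(r"\b[ACGT]{%d,}\b" % min_len, re.IGNORECASE)
    return [match.upper() for match in pattern.findall(text)]
```

In this sketch the seed documents play the role of already-curated positive examples, so a curator would read candidates from the top of the ranked list downward. The length threshold in `extract_dna` trades recall for precision: shorter thresholds catch short binding sites but also match more non-sequence text.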

Cite

Aerts, S., Haeussler, M., van Vooren, S., Griffith, O. L., Hulpiau, P., Jones, S. J. M., … Bergman, C. M. (2008). Text-mining assisted regulatory annotation. Genome Biology, 9(2). https://doi.org/10.1186/gb-2008-9-2-r31
