Extracting regulatory gene expression networks from PubMed

Jasmin Šarić; Lars J. Jensen; Rossitza Ouzounova; Isabel Rojas; Peer Bork

Conference Proceedings (Open Access)

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2004) 191-198

DOI: 10.3115/1218955.1218980

22 Citations

85 Readers

Abstract

We present an approach using syntactosemantic rules for the extraction of relational information from biomedical abstracts. The results show that by overcoming the hurdle of technical terminology, high-precision results can be achieved. From 58,664 abstracts related to baker’s yeast, we extract a regulatory network comprising 441 pairwise relations with an accuracy of 83–90%. To achieve this, we made use of a resource of gene/protein names considerably larger than those used in most other biology-related information extraction approaches. This list of names was included in the lexicon of our part-of-speech tagger, which we retrained for use on molecular biology abstracts. For the domain in question, a POS-tagging accuracy of 93.6–97.7% was attained. The method is easily adapted to organisms other than yeast, allowing us to extract many more biologically relevant relations.

Citation (APA)

Šarić, J., Jensen, L. J., Ouzounova, R., Rojas, I., & Bork, P. (2004). Extracting regulatory gene expression networks from PubMed. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 191–198). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1218955.1218980
