Using co-occurrence network structure to extract synonymous gene and protein names from MEDLINE abstracts

Abstract

Background: Text-mining can assist biomedical researchers in reducing information overload by extracting useful knowledge from large collections of text. We developed a novel text-mining method based on analyzing the network structure created by symbol co-occurrences as a way to extend the capabilities of knowledge extraction. The method was applied to the task of automatic gene and protein name synonym extraction.

Results: Performance was measured on a test set consisting of about 50,000 abstracts from one year of MEDLINE. Synonyms retrieved from curated genomics databases were used as a gold standard. The system obtained a maximum F-score of 22.21% (23.18% precision and 21.36% recall), with high efficiency in the use of seed pairs.

Conclusion: The method performs comparably with other studied methods, does not rely on sophisticated named-entity recognition, and requires little initial seed knowledge.

© 2005 Cohen et al; licensee BioMed Central Ltd.
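
As a rough illustration of what "network structure created by symbol co-occurrences" can mean, the sketch below builds a small co-occurrence graph and scores a pair of symbols by how much their neighbor sets overlap. This is not the authors' algorithm: the abstract does not describe the scoring used, so the Jaccard overlap, the function names `build_cooccurrence` and `neighbor_jaccard`, and the toy symbol lists are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(abstracts):
    """Map each symbol to the set of symbols it shares at least one abstract with."""
    neighbors = defaultdict(set)
    for symbols in abstracts:
        # Each unordered pair of distinct symbols in one abstract becomes an edge.
        for a, b in combinations(sorted(set(symbols)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def neighbor_jaccard(neighbors, a, b):
    """Jaccard overlap of two symbols' neighbor sets, ignoring the pair itself.

    Assumed scoring for illustration; not taken from the paper.
    """
    na = neighbors.get(a, set()) - {b}
    nb = neighbors.get(b, set()) - {a}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Hypothetical, hand-made symbol lists standing in for tokenized abstracts.
toy_abstracts = [
    ["TP53", "MDM2", "apoptosis"],
    ["p53", "MDM2", "apoptosis"],
    ["BRCA1", "repair"],
]

graph = build_cooccurrence(toy_abstracts)
score_similar = neighbor_jaccard(graph, "TP53", "p53")
score_unrelated = neighbor_jaccard(graph, "TP53", "BRCA1")
```

In this toy data the two symbols that never appear together but share the same neighbors score higher than an unrelated pair, which is one plausible intuition for why network structure might help find synonyms without named-entity recognition.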

Cite

Cohen, A. M., Hersh, W. R., Dubay, C., & Spackman, K. (2005). Using co-occurrence network structure to extract synonymous gene and protein names from MEDLINE abstracts. BMC Bioinformatics, 6. https://doi.org/10.1186/1471-2105-6-103
