Finding biomedical categories in Medline®


This article is free to access.

Abstract

Background: Several human-defined ontologies are relevant to Medline. However, Medline is a fast-growing collection of biomedical documents, which makes it difficult to update and expand these ontologies. Automatically identifying meaningful categories of entities in a large text corpus is useful for information extraction, construction of machine learning features, and development of semantic representations. In this paper we describe and compare two methods for automatically learning meaningful biomedical categories in Medline. The first approach is a simple statistical method that uses part-of-speech and frequency information to extract a list of frequent nouns from Medline. The second method implements an alignment-based technique to learn frequent generic patterns that indicate a hyponymy/hypernymy relationship between a pair of noun phrases. We then apply these patterns to Medline to collect frequent hypernyms as potential biomedical categories.

Results: We study and compare these two alternative sets of terms to identify semantic categories in Medline. We find that both approaches produce reasonable terms as potential categories. We also find significant agreement between the two sets of terms. Because the two methods are independent, this overlap increases our confidence in the categories they predict.

Conclusions: This study is an initial attempt to extract categories that are discussed in Medline. Rather than imposing external ontologies on Medline, our methods allow categories to emerge from the text.
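The two approaches described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names, the `min_count` threshold, the pre-tagged input with Penn Treebank style tags, and above all the single hand-written "X such as Y" pattern, which stands in for the patterns the paper learns through alignment.

```python
import re
from collections import Counter

# Assumption: the text is already part-of-speech tagged, given as
# (token, tag) pairs with Penn Treebank style noun tags.
NOUN_TAGS = {"NN", "NNS"}

def frequent_nouns(tagged_tokens, min_count=2):
    """Method 1 (sketch): keep nouns that occur at least min_count times."""
    counts = Counter(
        token.lower() for token, tag in tagged_tokens if tag in NOUN_TAGS
    )
    return {word for word, n in counts.items() if n >= min_count}

# Assumption: one hand-written pattern replaces the learned patterns.
# It captures single-word hypernyms only, with no noun-phrase chunking.
SUCH_AS = re.compile(r"\b(\w+) such as (\w+)", re.IGNORECASE)

def frequent_hypernyms(sentences, min_count=2):
    """Method 2 (sketch): count the left-hand term of each pattern match."""
    counts = Counter()
    for sentence in sentences:
        for hypernym, _hyponym in SUCH_AS.findall(sentence):
            counts[hypernym.lower()] += 1
    return {word for word, n in counts.items() if n >= min_count}

def agreement(nouns, hypernyms):
    """Terms proposed by both methods."""
    return nouns & hypernyms
```

Terms returned by `agreement` are those on which both sketches agree, mirroring the overlap the abstract reports between the two term sets. The sketch does no lemmatization, so singular and plural forms are counted separately.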


Citation (APA)

Yeganova, L., Kim, W., Comeau, D. C., & Wilbur, W. J. (2012). Finding biomedical categories in Medline®. Journal of Biomedical Semantics, 3(Suppl. 3), S3. https://doi.org/10.1186/2041-1480-3-S3-S3
