Human gene name normalization using text matching with automatically extracted synonym dictionaries

Haw Ren Fang; Kevin Murphy; Yang Jin; Jessica S. Kim; Peter S. White

Conference Proceedings

HLT-NAACL 2006 - BioNLP 2006: Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis, Proceedings of the Workshop (2006) 41-48

DOI: 10.3115/1567619.1567627

Abstract

The identification of genes in biomedical text typically consists of two stages: identifying gene mentions and normalizing gene names. We have created an automated process that takes the gene mentions output by named entity recognition (NER) systems and normalizes them to standard referents. The system extracts human gene synonyms from online databases to generate an extensive synonym lexicon. The lexicon is then compared against a list of candidate gene mentions using string transformations that can be applied and chained in a flexible order, followed by exact or approximate string matching. Using a gold standard of MEDLINE abstracts manually tagged and normalized for mentions of human genes, a combined tagging and normalization system achieved 0.669 F-measure (0.718 precision and 0.626 recall) at the mention level, and 0.901 F-measure (0.957 precision and 0.857 recall) at the document level for documents used for tagger training.
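The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three transformations, the toy lexicon, the `GENE:n` identifiers, the function names, and the 0.85 similarity cutoff are all assumptions chosen for illustration, and Python's standard `difflib` stands in for whatever approximate matcher the paper uses.

```python
import difflib
import re
from typing import Callable, Dict, List, Optional, Sequence

Transform = Callable[[str], str]

# Hypothetical string transformations; the paper's actual set is not given here.
def lowercase(s: str) -> str:
    return s.lower()

def strip_punctuation(s: str) -> str:
    # Drop every character that is neither a word character nor whitespace.
    return re.sub(r"[^\w\s]", "", s)

def collapse_whitespace(s: str) -> str:
    return " ".join(s.split())

def apply_chain(text: str, transforms: Sequence[Transform]) -> str:
    """Apply the transformations in the given order."""
    for transform in transforms:
        text = transform(text)
    return text

def build_index(lexicon: Dict[str, List[str]],
                transforms: Sequence[Transform]) -> Dict[str, str]:
    """Map each transformed synonym to its gene identifier (first one wins)."""
    index: Dict[str, str] = {}
    for gene_id, synonyms in lexicon.items():
        for synonym in synonyms:
            index.setdefault(apply_chain(synonym, transforms), gene_id)
    return index

def normalize(mention: str, index: Dict[str, str],
              transforms: Sequence[Transform],
              cutoff: float = 0.85) -> Optional[str]:
    """Try exact matching first, then fall back to approximate matching."""
    key = apply_chain(mention, transforms)
    if key in index:
        return index[key]
    close = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[close[0]] if close else None

# Toy lexicon with made-up identifiers (assumption, not real database IDs).
LEXICON = {
    "GENE:1": ["BRCA1", "breast cancer 1"],
    "GENE:2": ["TP53", "p53", "tumor protein p53"],
}
CHAIN = [lowercase, strip_punctuation, collapse_whitespace]
INDEX = build_index(LEXICON, CHAIN)

exact_hit = normalize("Brca-1", INDEX, CHAIN)              # exact match after transforms
fuzzy_hit = normalize("tumour protein p53", INDEX, CHAIN)  # approximate match
no_hit = normalize("xyz", INDEX, CHAIN)                    # nothing similar enough
```

Because the chain is an ordinary list of functions, transformations can be reordered or dropped without touching the matcher, which mirrors the "applied and chained in a flexible order" design the abstract describes.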

Citation (APA)

Fang, H. R., Murphy, K., Jin, Y., Kim, J. S., & White, P. S. (2006). Human gene name normalization using text matching with automatically extracted synonym dictionaries. In HLT-NAACL 2006 - BioNLP 2006: Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis, Proceedings of the Workshop (pp. 41–48). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1567619.1567627
