Abstract
Biomedical synonyms are important resources for Natural Language Processing in the biomedical domain. Existing synonym resources (e.g., the UMLS) are incomplete, and manual efforts to expand and enrich them are prohibitively expensive. We therefore develop and evaluate approaches for automated synonym extraction from Wikipedia. Using inter-wiki links, we extract candidate synonym pairs: the anchor text (e.g., “increased thirst”) in a Wikipedia page and the title (e.g., “polyuria”) of the page it links to. We rank the synonym candidates with word embeddings and pseudo-relevance feedback (PRF). Our results show that PRF-based re-ranking outperformed both the word-embedding-based approach and a strong baseline that uses inter-wiki link frequency. A hybrid method, Rank Score Combination, achieved the best results. Our analysis also suggests that medical synonyms mined from Wikipedia can increase the coverage of existing synonym resources such as the UMLS.
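The candidate extraction step and the link-frequency baseline mentioned in the abstract can be illustrated with a minimal sketch. It assumes the links are ordinary piped links in wikitext markup (`[[Page title|anchor text]]`); the function names and the sample pages are hypothetical, and the snippet is not the authors' implementation.

```python
import re
from collections import Counter

# Assumption: candidates come from piped wikitext links, [[Page title|anchor text]].
# An optional "#section" fragment after the title is skipped.
PIPED_LINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?\|([^\[\]|]+)\]\]")

def extract_candidates(wikitext):
    """Yield (anchor_text, page_title) pairs, lowercased, from piped links."""
    for title, anchor in PIPED_LINK.findall(wikitext):
        title, anchor = title.strip().lower(), anchor.strip().lower()
        if anchor != title:  # identical strings are not useful synonym candidates
            yield anchor, title

def rank_by_link_frequency(pages):
    """Rank candidate pairs by how often the anchor links to the title."""
    counts = Counter()
    for text in pages:
        counts.update(extract_candidates(text))
    return counts.most_common()

# Hypothetical sample pages, for illustration only.
sample_pages = [
    "Symptoms include [[Polyuria|increased thirst]] and fatigue.",
    "Patients report [[Polyuria|increased thirst]] or [[Polyuria|frequent urination]].",
]
ranked = rank_by_link_frequency(sample_pages)
for (anchor, title), freq in ranked:
    print(f"{anchor} -> {title}: {freq}")
```

A frequency count like this corresponds only to the baseline; the word-embedding and PRF re-ranking steps would operate on the candidate list it produces.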
Cite
Jagannatha, A. N., Chen, J., & Yu, H. (2015). Mining and Ranking Biomedical Synonym Candidates from Wikipedia. In EMNLP 2015 - 6th International Workshop on Health Text Mining and Information Analysis, LOUHI 2015 - Proceedings of the Workshop (pp. 142–151). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-2619