Abstract
This work introduces BioLORD, a new pretraining strategy for producing meaningful representations of clinical sentences and biomedical concepts. State-of-the-art methodologies operate by maximizing the similarity between representations of names referring to the same concept, and preventing collapse through contrastive learning. However, because biomedical names are not always self-explanatory, this sometimes results in non-semantic representations. BioLORD overcomes this issue by grounding its concept representations using definitions, as well as short descriptions derived from a multi-relational knowledge graph consisting of biomedical ontologies. Thanks to this grounding, the model produces more semantic concept representations that more closely match the hierarchical structure of ontologies. BioLORD establishes a new state of the art for text similarity on both clinical sentences (MedSTS) and biomedical concepts (MayoSRS).
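The contrastive setup described above can be sketched generically. The snippet below is a minimal illustration of an InfoNCE-style objective in which each concept name embedding is pulled toward the embedding of its own definition and pushed away from the other definitions in the batch; it is not BioLORD's actual implementation, and all names, dimensions, and the temperature value are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(name_embs, desc_embs, temperature=0.07):
    """Contrastive (InfoNCE-style) loss: each concept name should be
    most similar to its own definition among all definitions in the batch.
    Illustrative sketch only, not the BioLORD training code."""
    # L2-normalize so dot products become cosine similarities
    n = name_embs / np.linalg.norm(name_embs, axis=1, keepdims=True)
    d = desc_embs / np.linalg.norm(desc_embs, axis=1, keepdims=True)
    logits = n @ d.T / temperature  # (batch, batch) similarity matrix
    # Diagonal entries are the positive (name, definition) pairs;
    # off-diagonal entries act as in-batch negatives.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy batch: 3 concepts with 4-dimensional embeddings (hypothetical values)
rng = np.random.default_rng(0)
names = rng.normal(size=(3, 4))
aligned_loss = info_nce_loss(names, names)         # matched name/definition pairs
shuffled_loss = info_nce_loss(names, names[::-1])  # mismatched pairs
print(aligned_loss < shuffled_loss)  # matched pairs yield the lower loss
```

Minimizing this loss over matched (name, definition) pairs is what grounds the name representations in definitional content, since a name embedding can only score well if it lands near its definition's embedding rather than near a surface-similar but unrelated name.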
Remy, F., Demuynck, K., & Demeester, T. (2022). BioLORD: Learning Ontological Representations from Definitions for Biomedical Concepts and their Textual Descriptions. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 1454–1465). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.249