Abstract
This work introduces BioLORD, a new pretraining strategy for producing meaningful representations of clinical sentences and biomedical concepts. State-of-the-art methodologies operate by maximizing the similarity between representations of names referring to the same concept, and preventing collapse through contrastive learning. However, because biomedical names are not always self-explanatory, this sometimes results in non-semantic representations. BioLORD overcomes this issue by grounding its concept representations using definitions, as well as short descriptions derived from a multi-relational knowledge graph consisting of biomedical ontologies. Thanks to this grounding, the model produces more semantic concept representations that more closely match the hierarchical structure of ontologies. BioLORD establishes a new state of the art for text similarity on both clinical sentences (MedSTS) and biomedical concepts (MayoSRS).
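The contrastive setup described above can be sketched generically. The snippet below is a minimal illustration of an InfoNCE-style objective in which each concept name embedding is pulled toward the embedding of its own definition and pushed away from the other definitions in the batch; it is not BioLORD's actual implementation, and all names, dimensions, and the temperature value are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(name_embs, desc_embs, temperature=0.07):
    """Contrastive (InfoNCE-style) loss: each concept name should be
    most similar to its own definition among all definitions in the batch.
    Illustrative sketch only, not the BioLORD training code."""
    # L2-normalize so dot products become cosine similarities
    n = name_embs / np.linalg.norm(name_embs, axis=1, keepdims=True)
    d = desc_embs / np.linalg.norm(desc_embs, axis=1, keepdims=True)
    logits = n @ d.T / temperature  # (batch, batch) similarity matrix
    # Diagonal entries are the positive (name, definition) pairs;
    # off-diagonal entries act as in-batch negatives.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy batch: 3 concepts with 4-dimensional embeddings (hypothetical values)
rng = np.random.default_rng(0)
names = rng.normal(size=(3, 4))
aligned_loss = info_nce_loss(names, names)         # matched name/definition pairs
shuffled_loss = info_nce_loss(names, names[::-1])  # mismatched pairs
print(aligned_loss < shuffled_loss)  # matched pairs yield the lower loss
```

Minimizing this loss over matched (name, definition) pairs is what grounds the name representations in definitional content, since a name embedding can only score well if it lands near its definition's embedding rather than near a surface-similar but unrelated name.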
Remy, F., Demuynck, K., & Demeester, T. (2022). BioLORD: Learning Ontological Representations from Definitions for Biomedical Concepts and their Textual Descriptions. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 1454–1465). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.249