In this paper, we present our system for the textual entailment identification task as a subtask of the SemEval-2023 Task 7: Multi-evidence Natural Language Inference for Clinical Trial Data. The entailment identification task aims to determine whether a medical statement affirms a valid entailment given a clinical trial premise or forms a contradiction with it. Since the task is inherently a text classification task, we propose a system that performs binary classification given a statement and its associated clinical trial. Our proposed system leverages a human-defined prompt to aggregate the information contained in the statement, section name, and clinical trials. Pre-trained language models are then finetuned on the prompted input sentences to learn to discriminate the inferential relation between the statement and clinical trial. To validate our system, we conduct extensive experiments with a wide variety of pre-trained language models. Our best system is built on DeBERTa-v3-large, which achieves an F1 score of 0.764 and secures the fifth rank in the official leaderboard. Further analysis indicates that leveraging our designed prompt is effective, and our model suffers from a low recall. Our code and pre-trained models are available at https://github.com/HKUSTKnowComp/NLI4CT.
CITATION STYLE
Wang, W., Xu, B., Fang, T., Zhang, L., & Song, Y. (2023). KnowComp at SemEval-2023 Task 7: Fine-tuning Pre-trained Language Models for Clinical Trial Entailment Identification. In 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop (pp. 1–9). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.semeval-1.1
Mendeley helps you to discover research relevant for your work.