Multi-Task Training with In-Domain Language Models for Diagnostic Reasoning


Abstract

Generative artificial intelligence (AI) is a promising direction for augmenting clinical diagnostic decision support and reducing diagnostic errors, a leading contributor to medical errors. To further the development of clinical AI systems, the Diagnostic Reasoning Benchmark (DR.BENCH) was introduced as a comprehensive generative AI framework comprising six tasks that represent key components of clinical reasoning. We present a comparative analysis of in-domain versus out-of-domain language models, as well as multi-task versus single-task training, with a focus on the problem summarization task in DR.BENCH (Gao et al., 2023). We demonstrate that a multi-task, clinically trained language model outperforms its general-domain counterpart by a large margin, establishing a new state-of-the-art performance with a ROUGE-L score of 28.55. This research underscores the value of domain-specific training for optimizing clinical diagnostic reasoning tasks.
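The ROUGE-L score reported above measures summary quality via the longest common subsequence (LCS) between a generated summary and a reference. As a minimal sketch (a plain F1 variant without the length-weighting or stemming options some toolkits apply), it can be computed as:

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists (dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L as an F1 score over whitespace tokens (illustrative, unstemmed)."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_l_f1("the cat sat on the mat", "the cat is on the mat")` yields 5/6, since five of the six tokens form a common subsequence in order. Published scores such as the 28.55 above are typically the F-measure multiplied by 100.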

Citation (APA)

Sharma, B., Gao, Y., Miller, T., Churpek, M. M., Afshar, M., & Dligach, D. (2023). Multi-Task Training with In-Domain Language Models for Diagnostic Reasoning. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 78–85). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.clinicalnlp-1.10
