emrKBQA: A Clinical Knowledge-Base Question Answering Dataset

Abstract

We present emrKBQA, a dataset for answering physician questions from a structured patient record. It consists of questions, logical forms and answers. The questions and logical forms are generated based on real-world physician questions and are slot-filled and answered from patients in the MIMIC-III KB (Johnson et al., 2016) through a semi-automated process. This community-shared release consists of over 940,000 question, logical form and answer triplets with 389 types of questions and ≈7.5 paraphrases per question type. We perform experiments to validate the quality of the dataset and set benchmarks for question-to-logical-form learning, which helps answer questions on this dataset.

Cite

Raghavan, P., Mahajan, D., Liang, J., Chandra, R., & Szolovits, P. (2021). emrKBQA: A Clinical Knowledge-Base Question Answering Dataset. In Proceedings of the 20th Workshop on Biomedical Language Processing, BioNLP 2021 (pp. 64–73). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.bionlp-1.7
