MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation

Wenge Liu; Jianheng Tang; Yi Cheng; Wenjie Li; Yefeng Zheng; Xiaodan Liang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13551 LNAI, pp. 447–459 (2022)

DOI: 10.1007/978-3-031-17120-8_35

Abstract

Medical dialogue systems interact with patients to collect symptoms and provide treatment advice. In this task, medical entities (e.g., diseases, symptoms, and medicines) are the most central part of the dialogues. However, existing datasets either do not provide entity annotation or are too small in scale. In this paper, we present MedDG, an entity-centric medical dialogue dataset in which medical entities are annotated with the help of domain experts. It consists of 17,864 Chinese dialogues, 385,951 utterances, and 217,205 entities, at least one order of magnitude larger than existing entity-annotated datasets. Based on MedDG, we conduct preliminary research on entity-aware medical dialogue generation by implementing several benchmark models. Extensive experiments show that entity-aware adaptations of the generation models consistently enhance response quality, but there remains considerable room for improvement in future research. The code and the dataset are released at https://github.com/lwgkzl/MedDG.

Cite

Liu, W., Tang, J., Cheng, Y., Li, W., Zheng, Y., & Liang, X. (2022). MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13551 LNAI, pp. 447–459). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-17120-8_35
