doc2dial: A goal-oriented document-grounded dialogue dataset

Song Feng; Hui Wan; Chulaka Gunasekara; Siva Sankalp Patel; Sachindra Joshi; Luis A. Lastras

Conference Proceedings, Open Access

EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (2020) 8118-8128

DOI: 10.18653/v1/2020.emnlp-main.652

Abstract

We introduce doc2dial, a new dataset of goal-oriented dialogues that are grounded in the associated documents. Inspired by how authors compose documents to guide end users, we first construct dialogue flows based on the content elements that correspond to higher-level relations across text sections as well as lower-level relations between discourse units within a section. Then we present these dialogue flows to crowd contributors to create conversational utterances. The dataset includes over 4500 annotated conversations with an average of 14 turns that are grounded in over 450 documents from four domains. Compared to prior document-grounded dialogue datasets, this dataset covers a variety of dialogue scenes in information-seeking conversations. To evaluate the versatility of the dataset, we introduce multiple dialogue modeling tasks and present baseline approaches.

Cite

Feng, S., Wan, H., Gunasekara, C., Patel, S. S., Joshi, S., & Lastras, L. A. (2020). doc2dial: A goal-oriented document-grounded dialogue dataset. In EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 8118–8128). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.emnlp-main.652
