DG2: Data Augmentation Through Document Grounded Dialogue Generation

Abstract

Collecting data for training dialogue systems can be extremely expensive because it requires human participants and extensive annotation. This is especially true for document-grounded dialogue systems, where human experts must carefully read unstructured documents to answer users' questions. As a result, existing document-grounded dialogue datasets are relatively small, which hinders effective training of dialogue systems. In this paper, we propose an automatic, document-grounded data augmentation technique based on a generative dialogue model. The dialogue model consists of a user bot and an agent bot that synthesize diverse dialogues given an input document; these dialogues are then used to train a downstream model. When used to supplement the original dataset, our method achieves significant improvement over traditional data augmentation methods. We also achieve competitive performance in the low-resource setting.
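The user-bot/agent-bot loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the models, so the two bots below are trivial rule-based stand-ins for the learned generative models, and every name (`Turn`, `toy_user_bot`, `toy_agent_bot`, `synthesize_dialogue`, `augment`) is hypothetical. The sketch only shows the control flow: alternate user and agent turns conditioned on a document, then add the synthetic dialogues to the original training set.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    speaker: str  # "user" or "agent"
    text: str


# Hypothetical interface: a bot sees the document and the dialogue so far.
Bot = Callable[[str, List[Turn]], str]


def split_sentences(document: str) -> List[str]:
    """Naive sentence splitter on periods; enough for the toy example."""
    return [s.strip() for s in document.split(".") if s.strip()]


def toy_user_bot(document: str, history: List[Turn]) -> str:
    """Stand-in for a learned user model: asks about the next sentence."""
    sentences = split_sentences(document)
    asked = sum(1 for t in history if t.speaker == "user")
    target = sentences[asked % len(sentences)]
    topic = " ".join(target.split()[:3])
    return f"What does the document say about '{topic}'?"


def toy_agent_bot(document: str, history: List[Turn]) -> str:
    """Stand-in for a learned agent model: answers with the next sentence."""
    sentences = split_sentences(document)
    answered = sum(1 for t in history if t.speaker == "agent")
    return sentences[answered % len(sentences)] + "."


def synthesize_dialogue(document: str, user_bot: Bot, agent_bot: Bot,
                        num_exchanges: int = 2) -> List[Turn]:
    """Alternate user and agent turns, each conditioned on the document."""
    if not split_sentences(document):
        raise ValueError("document must contain at least one sentence")
    history: List[Turn] = []
    for _ in range(num_exchanges):
        history.append(Turn("user", user_bot(document, history)))
        history.append(Turn("agent", agent_bot(document, history)))
    return history


def augment(original: List[List[Turn]], documents: List[str],
            user_bot: Bot, agent_bot: Bot) -> List[List[Turn]]:
    """Return the original dialogues plus one synthetic dialogue per document."""
    synthetic = [synthesize_dialogue(d, user_bot, agent_bot) for d in documents]
    return original + synthetic


if __name__ == "__main__":
    doc = "Renewals open in May. Fees are due by June."
    for turn in synthesize_dialogue(doc, toy_user_bot, toy_agent_bot):
        print(f"{turn.speaker}: {turn.text}")
```

In a real setting, each toy bot would presumably be replaced by a trained generative model, and the output of `augment` would feed the downstream model's training.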

Cite

Wu, Q., Feng, S., Chen, D., Joshi, S., Lastras, L. A., & Yu, Z. (2022). DG2: Data Augmentation Through Document Grounded Dialogue Generation. In SIGDIAL 2022 - 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference (pp. 204–216). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.sigdial-1.21
