Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation

Abstract

Multiple different responses are often plausible for a given open-domain dialog context. Prior work has shown the importance of having multiple valid reference responses for meaningful and robust automated evaluation. In such cases, common practice has been to collect more human-written references. However, such collection can be expensive, time-consuming, and hard to scale. Instead, we propose a novel technique for automatically expanding a human-generated reference into a set of candidate references. We fetch plausible references from knowledge sources and adapt them so that they are more fluent in the context of the dialog instance in question. More specifically, we use (1) a commonsense knowledge base to elicit a large number of plausible reactions given the dialog history, and (2) relevant instances retrieved from a dialog corpus, using similar past as well as future contexts. We demonstrate that our automatically expanded reference sets lead to large improvements in the correlation of automated metrics with human ratings of system outputs on the DailyDialog dataset.
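
The downstream use of an expanded reference set is ordinary multi-reference scoring: the system response is compared against the union of the human reference and the automatically obtained candidates. Below is a minimal sketch using NLTK's sentence-level BLEU as the metric. The whitespace tokenization, the helper name `multi_reference_bleu`, and the example references are illustrative assumptions, not the authors' exact setup (the paper evaluates correlations for several automated metrics, not only BLEU).

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction


def multi_reference_bleu(hypothesis: str, references: list[str]) -> float:
    """Score a system response against a set of reference responses.

    `references` may mix the original human reference with automatically
    retrieved or knowledge-base-derived candidates; sentence_bleu takes
    the best n-gram matches across the whole reference set.
    """
    smooth = SmoothingFunction().method1  # avoid zero scores on short responses
    ref_tokens = [r.split() for r in references]  # simple whitespace tokenization
    return sentence_bleu(ref_tokens, hypothesis.split(), smoothing_function=smooth)


# Hypothetical example: one human-written reference augmented with
# a retrieved response and a commonsense-derived reaction.
human_ref = "That sounds great, I would love to join you."
augmented_refs = [
    human_ref,
    "Sure, count me in for tonight.",        # retrieved from a dialog corpus
    "I feel excited about the invitation.",  # elicited from a commonsense KB
]
print(multi_reference_bleu("Sounds fun, I'm in!", augmented_refs))
```

With a single reference, a valid but lexically different response scores near zero; the augmented set rewards it whenever any plausible reference overlaps, which is what drives the improved correlation with human ratings.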

Cite

APA

Gangal, V., Jhamtani, H., Hovy, E., & Berg-Kirkpatrick, T. (2021). Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 4079–4090). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.357
