Predictive Engagement: An Efficient Metric for Automatic Evaluation of Open-Domain Dialogue Systems

Abstract

User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement, using heuristically constructed features such as the number of turns and the total duration of the conversation. In this paper, we investigate the possibility and efficacy of estimating utterance-level engagement and define a novel metric, predictive engagement, for automatic evaluation of open-domain dialogue systems. Our experiments demonstrate that (1) human annotators have high agreement on assessing utterance-level engagement scores; (2) conversation-level engagement scores can be predicted from properly aggregated utterance-level engagement scores. Furthermore, we show that utterance-level engagement scores can be learned from data. These scores can be incorporated into automatic evaluation metrics for open-domain dialogue systems to improve their correlation with human judgements. This suggests that predictive engagement can be used as real-time feedback for training better dialogue models.
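The pipeline the abstract describes can be pictured in a short sketch: score each (query, reply) exchange with a learned utterance-level engagement model, aggregate those scores to the conversation level, and blend the result with an existing automatic metric such as a relevance score. The snippet below is a minimal illustration only: the `dummy_scorer` heuristic, the mean aggregation, and the equal-weight blend are placeholder assumptions, not the paper's actual model or combination rule.

```python
from statistics import mean
from typing import Callable, List, Tuple

# A (query, reply) exchange from an open-domain dialogue.
Turn = Tuple[str, str]


def conversation_engagement(
    turns: List[Turn],
    utterance_scorer: Callable[[str, str], float],
) -> float:
    """Aggregate utterance-level engagement scores into a conversation-level
    score. Mean aggregation is shown here as one plausible choice."""
    return mean(utterance_scorer(query, reply) for query, reply in turns)


def combined_metric(relevance: float, engagement: float, weight: float = 0.5) -> float:
    """Fold a predictive-engagement score into an existing automatic metric
    (e.g., a relevance score) via a weighted average -- an illustrative
    combination, not necessarily the one used in the paper."""
    return weight * engagement + (1.0 - weight) * relevance


if __name__ == "__main__":
    # Placeholder scorer: stands in for a model trained on human
    # utterance-level engagement annotations; returns a value in [0, 1].
    def dummy_scorer(query: str, reply: str) -> float:
        return min(len(reply.split()) / 20.0, 1.0)  # toy heuristic only

    conversation = [
        ("How was your weekend?", "Great! I went hiking and saw a waterfall."),
        ("That sounds fun. Where?", "Up in the mountains, about an hour away."),
    ]
    eng = conversation_engagement(conversation, dummy_scorer)
    print(f"conversation-level engagement: {eng:.3f}")
    print(f"combined metric: {combined_metric(relevance=0.7, engagement=eng):.3f}")
```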

Citation (APA)
Ghazarian, S., Weischedel, R., Galstyan, A., & Peng, N. (2020). Predictive Engagement: An Efficient Metric for Automatic Evaluation of Open-Domain Dialogue Systems. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 7789–7796). AAAI Press. https://doi.org/10.1609/aaai.v34i05.6283
