Learning from dialogue after deployment: Feed yourself, Chatbot!

Abstract

The majority of conversations a dialogue agent sees over its lifetime occur after it has already been trained and deployed, leaving a vast store of potential training signal untapped. In this work, we propose the self-feeding chatbot, a dialogue agent with the ability to extract new training examples from the conversations it participates in. As our agent engages in conversation, it also estimates user satisfaction in its responses. When the conversation appears to be going well, the user's responses become new training examples to imitate. When the agent believes it has made a mistake, it asks for feedback; learning to predict the feedback that will be given improves the chatbot's dialogue abilities further. On the PERSONACHAT chitchat dataset with over 131k training examples, we find that learning from dialogue with a self-feeding chatbot significantly improves performance, regardless of the amount of traditional supervision.
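
The control flow the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the thresholds `high` and `low`, the buffer class, the function names, and the wording of the feedback request are all hypothetical, and the satisfaction score is assumed to come from some separately trained estimator that is not shown here.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical wording for the feedback request; not taken from the source.
FEEDBACK_REQUEST = "Oops, I think I made a mistake. What should I have said?"


@dataclass
class SelfFeedingBuffers:
    """Hypothetical containers for the two kinds of harvested examples."""
    dialogue_examples: List[Tuple[List[str], str]] = field(default_factory=list)
    feedback_examples: List[Tuple[List[str], str]] = field(default_factory=list)


def route_turn(
    context: List[str],
    user_reply: str,
    satisfaction: float,
    buffers: SelfFeedingBuffers,
    high: float = 0.8,  # assumed threshold for "conversation is going well"
    low: float = 0.3,   # assumed threshold for "the agent made a mistake"
) -> Optional[str]:
    """Decide what to do with one user turn.

    `satisfaction` is an estimate in [0, 1] supplied by the caller.
    Returns a feedback request to send to the user, or None.
    """
    if satisfaction >= high:
        # Going well: the user's reply becomes a new example to imitate.
        buffers.dialogue_examples.append((list(context), user_reply))
        return None
    if satisfaction <= low:
        # Likely mistake: ask the user for feedback.
        return FEEDBACK_REQUEST
    # Uncertain region: harvest nothing and ask nothing.
    return None


def record_feedback(
    context: List[str], feedback: str, buffers: SelfFeedingBuffers
) -> None:
    """Store the user's feedback as a target to predict for this context."""
    buffers.feedback_examples.append((list(context), feedback))
```

In this sketch the two buffers would later be mixed into training alongside the original supervised data; how the examples are weighted or batched is left open because the abstract does not specify it.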

Cite

Hancock, B., Bordes, A., Mazaré, P. E., & Weston, J. (2020). Learning from dialogue after deployment: Feed yourself, Chatbot! In ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 3667–3684). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p19-1358
