Context-uncertainty-aware chatbot action selection via parameterized auxiliary reinforcement learning

Chuandong Yin; Rui Zhang; Jianzhong Qi; Yu Sun; Tenglun Tan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10937 LNAI 500-512

DOI: 10.1007/978-3-319-93034-3_40

3 Citations

19 Readers

Abstract

We propose a context-uncertainty-aware chatbot and a reinforcement learning (RL) model to train it, named Parameterized Auxiliary Asynchronous Advantage Actor Critic (PA4C). We use a user simulator, built on real data, to simulate the uncertainty of users’ utterances. The PA4C model interacts with simulated users and gradually adapts to the varying confidence of users’ utterances in a conversation context. Compared with naive rule-based approaches, a chatbot trained via the PA4C model avoids hand-crafted action selection and is more robust to variance in user utterances. The PA4C model extends conventional RL models with action parameterization and auxiliary tasks for chatbot training, which address the problems of a large action space and zero-reward states. We evaluate the PA4C model by training a chatbot for calendar event creation tasks. Experimental results show that our model outperforms state-of-the-art RL models in terms of success rate, dialogue length, and episode reward.
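
As a rough illustration of why action parameterization can shrink a large action space, the sketch below splits each chatbot action into an action type and a slot parameter, each chosen by its own small softmax head. It is not the PA4C architecture, which the abstract does not specify. The action types, slot names, logit values, and the assumption that type and slot are sampled independently are all hypothetical.

```python
import math
import random

# Hypothetical action types and slot parameters for a calendar-event chatbot.
ACTION_TYPES = ["ask", "confirm", "inform"]
SLOTS = ["title", "date", "time", "location"]


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(type_logits, slot_logits, rng):
    """Sample an action type and a slot parameter from two separate heads.

    Assumption: the two heads are independent, so the joint probability
    is the product of the two head probabilities.
    """
    type_probs = softmax(type_logits)
    slot_probs = softmax(slot_logits)
    t = rng.choices(range(len(ACTION_TYPES)), weights=type_probs)[0]
    s = rng.choices(range(len(SLOTS)), weights=slot_probs)[0]
    return ACTION_TYPES[t], SLOTS[s], type_probs[t] * slot_probs[s]


# One flat head needs an output per (type, slot) pair; two heads need the sum.
flat_outputs = len(ACTION_TYPES) * len(SLOTS)
factored_outputs = len(ACTION_TYPES) + len(SLOTS)

rng = random.Random(0)  # fixed seed so repeated runs sample the same action
action, slot, prob = select_action([0.2, 1.5, -0.3], [0.0, 0.4, 0.1, -1.0], rng)
print(f"flat head: {flat_outputs} outputs, factored heads: {factored_outputs} outputs")
print(f"sampled action: {action}({slot}) with probability {prob:.3f}")
```

With three types and four slots the saving is small (12 outputs versus 7), but the gap widens as either set grows, since the flat count is a product and the factored count is a sum.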

Cite

Yin, C., Zhang, R., Qi, J., Sun, Y., & Tan, T. (2018). Context-uncertainty-aware chatbot action selection via parameterized auxiliary reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10937 LNAI, pp. 500–512). Springer Verlag. https://doi.org/10.1007/978-3-319-93034-3_40
