Abstract
In this paper, we propose a personalized dialogue generation system that combines reinforcement learning techniques with an attention-based hierarchical recurrent encoder-decoder model. First, we incorporate user-specific information into the decoder to capture the user's background and speaking style. Second, we employ reinforcement learning to maximize future reward in dialogue, which enables our system to generate topic-coherent, informative, and grammatical responses. Moreover, we propose three types of rewards to characterize good conversations. Finally, we compare the performance of three reinforcement learning methods for dialogue generation: policy gradient, Q-learning, and actor-critic algorithms. We conduct experiments on two dialogue datasets to verify the effectiveness of the proposed model. Experimental results demonstrate that our model can generate better personalized dialogues for different users. Quantitatively, our method outperforms state-of-the-art dialogue systems in terms of BLEU score, perplexity, and human evaluation.
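The abstract mentions maximizing future reward with policy gradient, among other methods. As a rough illustration of that idea (not the paper's actual model, which uses a hierarchical encoder-decoder over dialogue data), the sketch below runs REINFORCE on a toy one-step "decoder": a softmax policy over a tiny vocabulary, with a hypothetical reward that favors one response token. The vocabulary, reward, and learning rate are all invented for illustration.

```python
import math
import random

random.seed(0)

VOCAB = ["hi", "hello", "bye"]  # toy response vocabulary (illustrative only)

# Logits of a single-step softmax policy; a real decoder would condition
# these on the dialogue context and user embedding.
logits = [0.0, 0.0, 0.0]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sample(probs):
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def reward(token_id):
    # Hypothetical reward signal: prefer the "informative" greeting.
    return 1.0 if VOCAB[token_id] == "hello" else 0.0

lr = 0.5
for _ in range(200):
    probs = softmax(logits)
    a = sample(probs)          # sample a response token from the policy
    r = reward(a)              # score it with the reward function
    # REINFORCE update: d log pi(a) / d logit_i = 1{i == a} - pi(i)
    for i in range(len(logits)):
        grad = (1.0 if i == a else 0.0) - probs[i]
        logits[i] += lr * r * grad

best = max(range(len(VOCAB)), key=lambda i: logits[i])
```

After training, the policy's probability mass concentrates on the rewarded token, which is the core mechanism the paper applies at sequence level with its three dialogue-quality rewards.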
Citation
Yang, M., Qu, Q., Lei, K., Zhu, J., Zhao, Z., Chen, X., & Huang, J. Z. (2018). Investigating deep reinforcement learning techniques in personalized dialogue generation. In SIAM International Conference on Data Mining, SDM 2018 (pp. 630–638). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611975321.71