Abstract
To address the challenge of policy learning in open-domain multi-turn conversation, we propose to represent prior information about dialog transitions as a graph and to learn a graph-grounded dialog policy, aiming at more coherent and controllable dialog. To this end, we first construct a conversational graph (CG) from dialog corpora, in which vertices represent “what to say” and “how to say it”, and edges represent natural transitions between a message (the last utterance in a dialog context) and its response. We then present a novel CG-grounded policy learning framework that conducts dialog flow planning by graph traversal: at each turn it learns to identify a what-vertex and a how-vertex from the CG to guide response generation. In this way, the CG facilitates policy learning in three respects: (1) it enables more effective long-term reward design, (2) it provides high-quality candidate actions, and (3) it gives us more control over the policy. Results on two benchmark corpora demonstrate the effectiveness of this framework.
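The graph structure described above can be sketched as a small data structure: what-vertices (content keywords), how-vertices (response mechanisms), and edges encoding natural message-to-response transitions, from which the policy draws its candidate actions at each turn. All names, topics, and mechanisms below are illustrative assumptions, not the paper's actual implementation:

```python
# Minimal sketch of a conversational graph (CG), assuming keyword
# what-vertices and dialog-mechanism how-vertices; the topics and
# mechanisms here are hypothetical examples, not from the paper.
from collections import defaultdict


class ConversationalGraph:
    def __init__(self):
        # Maps a what-vertex to outgoing edges, each pairing a target
        # what-vertex ("what to say") with a how-vertex ("how to say it").
        self.edges = defaultdict(list)

    def add_transition(self, src_what, dst_what, how):
        """Edge: a message about `src_what` naturally transitions to a
        response about `dst_what`, realized via mechanism `how`."""
        self.edges[src_what].append((dst_what, how))

    def candidate_actions(self, message_what):
        """Candidate (what-vertex, how-vertex) actions offered to the
        policy at this turn, given the message's what-vertex."""
        return list(self.edges.get(message_what, []))


cg = ConversationalGraph()
cg.add_transition("weather", "travel", "ask-question")
cg.add_transition("weather", "weekend-plans", "statement")

# The learned policy would score and pick one of these actions;
# here we just enumerate them.
actions = cg.candidate_actions("weather")
print(actions)
```

A graph traversal over such edges constrains the policy to transitions observed in the corpora, which is what makes the resulting dialog flow more coherent and controllable than free-form action selection.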
Xu, J., Wang, H., Niu, Z. Y., Wu, H., Che, W., & Liu, T. (2020). Conversational graph grounded policy learning for open-domain conversation generation. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 1835–1845). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.166