Bayes-adaptive monte-carlo planning and learning for goal-oriented dialogues

Youngsoo Jang; Jongmin Lee; Kee Eung Kim

Conference Proceedings, Open Access

AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (2020) 7994-8001

DOI: 10.1609/aaai.v34i05.6308

16 Citations

30 Readers

Abstract

We consider a strategic dialogue task, where the ability to infer the other agent’s goal is critical to the success of the conversational agent. While this problem can be naturally formulated as Bayesian planning, it is known to be a very difficult problem due to its enormous search space consisting of all possible utterances. In this paper, we introduce an efficient Bayes-adaptive planning algorithm for goal-oriented dialogues, which combines RNN-based dialogue generation and MCTS-based Bayesian planning in a novel way, leading to robust decision-making under the uncertainty of the other agent’s goal. We then introduce reinforcement learning for the dialogue agent that uses MCTS as a strong policy improvement operator, casting reinforcement learning as iterative alternation of planning and supervised learning on self-generated dialogues. In the experiments, we demonstrate that our Bayes-adaptive dialogue planning agent significantly outperforms the state of the art in a negotiation dialogue domain. We also show that reinforcement learning via MCTS further improves end-task performance without diverging from human language.

Cite

Jang, Y., Lee, J., & Kim, K. E. (2020). Bayes-adaptive monte-carlo planning and learning for goal-oriented dialogues. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 7994–8001). AAAI Press. https://doi.org/10.1609/aaai.v34i05.6308
