Scheduled Dialog Policy Learning: An Automatic Curriculum Learning Framework for Task-oriented Dialog System

Sihong Liu; Jinchao Zhang; Keqing He; Weiran Xu; Jie Zhou

Conference Proceedings, Open Access

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (2021) 1091-1102

DOI: 10.18653/v1/2021.findings-acl.94

Abstract

In reinforcement learning (RL) based task-oriented dialogue systems, users act as the environment and the agent learns the policy by interacting with users. However, due to the subjectivity of different users, the complexity of user-generated training conversations varies greatly, which leads to varying levels of learning difficulty for the agent. Therefore, it is necessary to model dialogue complexity and make a reasonable learning schedule for training the agent efficiently. To this end, we propose Scheduled Dialog Policy Learning, an automatic curriculum learning framework that combines curriculum learning and policy optimization in the task-oriented dialog system. To the best of our knowledge, it is the first RL framework that improves dialogue policy learning by scheduling its learning process. Specifically, we introduce an automatic measurement to evaluate dialogue complexity and, based on this measurement, we train the dialog agent from easy dialogues to complex ones. Experiments demonstrate that our approach can be applied to task-oriented dialogue policy learning and outperforms the previous state-of-the-art model, increasing the dialog success rate by 9.6% and 10.0% on the MultiWoz and Movie-Ticket Booking datasets, respectively.
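
The easy-to-complex training order described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the complexity measurement or the schedule, so the function name `curriculum_stages`, the `complexity` callable, and the cumulative staging rule are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_stages(
    items: Sequence[T],
    complexity: Callable[[T], float],
    num_stages: int,
) -> List[List[T]]:
    """Split training items into cumulative easy-to-hard stages.

    Hypothetical helper: `complexity` stands in for the paper's automatic
    dialogue-complexity measurement, which the abstract does not define.
    Stage k holds the easiest k/num_stages share of the items, so later
    stages keep the easy items and add harder ones.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    # Order items from least to most complex (stable for ties).
    ordered = sorted(items, key=complexity)
    stages: List[List[T]] = []
    for k in range(1, num_stages + 1):
        # Always expose at least one item, even in the first stage.
        cutoff = max(1, (len(ordered) * k) // num_stages)
        stages.append(list(ordered[:cutoff]))
    return stages


# Toy usage: user goals scored by their number of constraints (an assumed proxy).
goals = [
    {"area": "north", "price": "cheap", "food": "thai"},
    {"area": "south"},
    {"area": "east", "price": "expensive"},
    {"area": "west", "price": "cheap", "food": "pizza", "day": "friday"},
]
for stage_index, stage in enumerate(curriculum_stages(goals, len, num_stages=2), start=1):
    print(stage_index, [len(goal) for goal in stage])
```

In an RL setting, each stage would be the pool of user goals sampled while the policy is optimized, with the agent moving to the next stage according to some schedule; that schedule is likewise left unspecified here.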

Cite

Liu, S., Zhang, J., He, K., Xu, W., & Zhou, J. (2021). Scheduled Dialog Policy Learning: An Automatic Curriculum Learning Framework for Task-oriented Dialog System. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 1091–1102). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.94
