Actor-double-critic: Incorporating model-based critic for task-oriented dialogue systems

Abstract

To improve the sample efficiency of deep reinforcement learning (DRL), we implemented the imagination-augmented agent (I2A) in spoken dialogue systems (SDS). Although I2A achieves a higher success rate than the baselines by augmenting the policy network with predicted futures, its complicated architecture introduces unwanted instability. In this work, we propose actor-double-critic (ADC) to improve the stability and overall performance of I2A. ADC simplifies the architecture of I2A to reduce superfluous parameters and hyper-parameters. More importantly, a separate model-based critic shares parameters across actions and makes back-propagation explicit. In our experiments on the Cambridge restaurant booking task, ADC considerably improves success rates and shows robustness to imperfect environment models. In addition, ADC exhibits stability and sample efficiency, significantly reducing the baseline's standard deviation of success rates and reaching an 80% success rate with half the training data.
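To make the idea of pairing a model-free critic with a separate model-based critic concrete, the PyTorch sketch below shows one plausible wiring. It is an illustrative assumption, not the paper's architecture: the layer sizes, the one-step environment model, and the way the model-based critic evaluates an imagined next state for each action are hypothetical choices; the only property taken from the abstract is that the model-based critic's parameters are shared across actions.

```python
import torch
import torch.nn as nn


class ActorDoubleCritic(nn.Module):
    """Minimal sketch of an actor with two critics: a model-free critic on the
    current state and a model-based critic on one-step predictions from a
    learned environment model. Sizes and interfaces are illustrative only."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.action_dim = action_dim
        self.actor = nn.Sequential(                 # action logits
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim))
        self.mf_critic = nn.Sequential(             # model-free state value
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))
        self.env_model = nn.Sequential(             # predicts the next state
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim))
        self.mb_critic = nn.Sequential(             # value of imagined state
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, state: torch.Tensor):
        # state: (batch, state_dim)
        logits = self.actor(state)
        v_mf = self.mf_critic(state).squeeze(-1)
        # Imagine one step ahead for every discrete action; the same env_model
        # and mb_critic parameters are shared across all actions.
        batch = state.shape[0]
        actions = torch.eye(self.action_dim).unsqueeze(0).expand(batch, -1, -1)
        states = state.unsqueeze(1).expand(-1, self.action_dim, -1)
        imagined = self.env_model(torch.cat([states, actions], dim=-1))
        v_mb = self.mb_critic(imagined).squeeze(-1)  # (batch, action_dim)
        return logits, v_mf, v_mb


if __name__ == "__main__":
    net = ActorDoubleCritic(state_dim=20, action_dim=5)
    logits, v_mf, v_mb = net(torch.randn(4, 20))
    print(logits.shape, v_mf.shape, v_mb.shape)      # (4, 5) (4,) (4, 5)
```

In this sketch the policy gradient could use either critic (or a mix) as its baseline; how the two value estimates are actually combined and trained is described in the paper itself.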

Cite

APA

Wu, Y.-C., Tseng, B.-H., & Gašić, M. (2020). Actor-double-critic: Incorporating model-based critic for task-oriented dialogue systems. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 854–863). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.findings-emnlp.75
