A Collaborative Multi-agent Reinforcement Learning Framework for Dialog Action Decomposition

Huimin Wang; Kam Fai Wong

Conference Proceedings, Open Access

EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings (2021) 7882-7889

DOI: 10.18653/v1/2021.emnlp-main.621

16 Citations

67 Readers

Abstract

Most reinforcement learning methods for dialog policy learning train a centralized agent that selects a predefined joint action concatenating domain name, intent type, and slot name. Because the resulting action space is large, the centralized dialog agent requires a great many user-agent interactions to train. In addition, designing the concatenated actions is laborious for engineers and may struggle with edge cases. To solve these problems, we model the dialog policy learning problem with a novel multi-agent framework in which each part of the action is led by a different agent. The framework reduces labor costs for action templates and decreases the size of the action space for each agent. Furthermore, we relieve the non-stationarity problem, caused by the dynamics of the environment changing as the agents' policies evolve, by introducing a joint optimization process that allows agents to exchange their policy information. Concurrently, an independent experience replay buffer mechanism is integrated to reduce the dependence between gradients of samples and improve training efficiency. The effectiveness of the proposed framework is demonstrated in a multi-domain environment with both user simulator evaluation and human evaluation.
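The action decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the domain, intent, and slot vocabularies, the `ComponentAgent` class, and the random placeholder policy are all hypothetical, and the joint optimization step that lets agents exchange policy information is not modeled. The sketch only shows how one agent per action part shrinks each agent's action space relative to the concatenated joint space, and how each agent can keep its own independent replay buffer.

```python
import random
from collections import deque
from itertools import product

# Hypothetical vocabularies, chosen only for illustration (not from the paper).
DOMAINS = ["restaurant", "hotel", "train"]
INTENTS = ["inform", "request", "recommend", "book"]
SLOTS = ["name", "area", "price", "time", "none"]

# A centralized agent would choose among concatenated (domain, intent, slot) actions.
# The full product is an upper bound on such a predefined joint action set.
joint_actions = list(product(DOMAINS, INTENTS, SLOTS))


class ComponentAgent:
    """Hypothetical agent responsible for one part of the dialog action."""

    def __init__(self, name, choices, buffer_size=1000, seed=0):
        self.name = name
        self.choices = choices
        # Each agent keeps its own independent experience replay buffer.
        self.buffer = deque(maxlen=buffer_size)
        self.rng = random.Random(seed)

    def act(self, state):
        # Placeholder policy: a uniform random choice stands in for a learned policy.
        return self.rng.choice(self.choices)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))


agents = [
    ComponentAgent("domain", DOMAINS, seed=1),
    ComponentAgent("intent", INTENTS, seed=2),
    ComponentAgent("slot", SLOTS, seed=3),
]


def select_dialog_action(state):
    """Compose the dialog action from each agent's own choice."""
    return tuple(agent.act(state) for agent in agents)


state = "user: I need a cheap hotel"
action = select_dialog_action(state)
reward = 0.0  # placeholder; a real reward would come from the environment
for agent, part in zip(agents, action):
    agent.store(state, part, reward, state)

print(len(joint_actions))                 # 60 joint actions for a centralized agent
print([len(a.choices) for a in agents])   # [3, 4, 5] actions per decomposed agent
```

With these illustrative vocabularies, a centralized agent faces up to 60 joint actions, while the three decomposed agents choose among 3, 4, and 5 options respectively.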

Cite

Wang, H., & Wong, K. F. (2021). A Collaborative Multi-agent Reinforcement Learning Framework for Dialog Action Decomposition. In EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 7882–7889). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.emnlp-main.621
