Hybrid learning for multi-agent cooperation with sub-optimal demonstrations

Abstract

This paper aims to learn multi-agent cooperation in which each agent acts in a decentralized way. Learning decentralized policies is especially challenging when rewards are global and sparse. Learning from demonstrations (LfD) offers a promising way to handle this challenge; however, in many practical tasks the available demonstrations are sub-optimal. To learn better policies from such sub-optimal demonstrations, this paper follows the centralized-learning, decentralized-execution framework and proposes a novel hybrid learning method based on multi-agent actor-critic. First, the trajectory returns computed from the demonstration actions are used to pre-train the centralized critic network. Then, multi-agent decisions are derived by best-response dynamics under the critic and used to train the decentralized actor networks. Finally, the demonstrations are updated by the actor networks, and the critic and actor networks are trained jointly by running the two preceding steps alternately. We evaluate the proposed approach on a real-time strategy combat game; experimental results show that it outperforms many competing demonstration-based methods.
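To make the alternating loop concrete, below is a minimal sketch in PyTorch of the three steps the abstract describes. Everything in it is an illustrative assumption rather than the authors' implementation: the toy dimensions (N_AGENTS, STATE_DIM, N_ACTIONS), the network shapes, and all names (Actor, CentralCritic, pretrain_critic, best_response, hybrid_learning) are hypothetical; best-response dynamics is realized as coordinate-ascent sweeps over each agent's discrete actions under the critic; and the demonstration returns are held fixed across iterations, whereas the paper would obtain fresh returns by running the updated demonstrations in the environment.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy setting (all values hypothetical): N agents with discrete actions
# and a shared global state.
N_AGENTS, STATE_DIM, N_ACTIONS = 2, 8, 4

class Actor(nn.Module):
    """Decentralized policy: maps an observation to action logits.
    For simplicity each actor here sees the global state; in a truly
    decentralized setting it would see only a local observation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, N_ACTIONS))

    def forward(self, obs):
        return self.net(obs)

class CentralCritic(nn.Module):
    """Centralized Q-network over the state and the joint one-hot action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + N_AGENTS * N_ACTIONS, 64), nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, states, joint_onehot):
        return self.net(torch.cat([states, joint_onehot], -1)).squeeze(-1)

def onehot_joint(actions):
    # (batch, N_AGENTS) int64 -> (batch, N_AGENTS * N_ACTIONS) float
    return F.one_hot(actions, N_ACTIONS).float().flatten(1)

def pretrain_critic(critic, states, actions, returns, steps=200):
    """Step 1: regress the critic onto demonstration trajectory returns."""
    opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
    for _ in range(steps):
        loss = F.mse_loss(critic(states, onehot_joint(actions)), returns)
        opt.zero_grad()
        loss.backward()
        opt.step()

def best_response(critic, states, actions, sweeps=5):
    """Step 2: best-response dynamics under the critic. Each agent in
    turn switches to the action maximizing Q while the other agents'
    actions stay fixed."""
    actions = actions.clone()
    with torch.no_grad():
        for _ in range(sweeps):
            for i in range(N_AGENTS):
                qs = []
                for a in range(N_ACTIONS):
                    cand = actions.clone()
                    cand[:, i] = a
                    qs.append(critic(states, onehot_joint(cand)))
                actions[:, i] = torch.stack(qs, 1).argmax(1)
    return actions

def hybrid_learning(actors, critic, demo, iters=10):
    states, actions, returns = demo
    opts = [torch.optim.Adam(a.parameters(), lr=1e-3) for a in actors]
    for _ in range(iters):
        pretrain_critic(critic, states, actions, returns)   # step 1
        targets = best_response(critic, states, actions)    # step 2
        for i, (actor, opt) in enumerate(zip(actors, opts)):
            # Step 3a: supervise each actor on its best-response action.
            loss = F.cross_entropy(actor(states), targets[:, i])
            opt.zero_grad()
            loss.backward()
            opt.step()
        with torch.no_grad():
            # Step 3b: relabel demonstration actions with the actors.
            # (Simplification: returns stay fixed; the paper would
            # re-evaluate the updated demonstrations in the environment.)
            actions = torch.stack(
                [actor(states).argmax(1) for actor in actors], 1)

# Usage with random placeholder "demonstrations":
demo = (torch.randn(64, STATE_DIM),
        torch.randint(0, N_ACTIONS, (64, N_AGENTS)),
        torch.randn(64))
hybrid_learning([Actor() for _ in range(N_AGENTS)], CentralCritic(), demo)
```

The fixed-return shortcut in step 3b only keeps the sketch self-contained; in the paper's setting the relabeled demonstrations would be rolled out in the combat game to produce new returns before the critic is re-trained.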

Citation (APA)

Peng, P., Xing, J., & Cao, L. (2020). Hybrid learning for multi-agent cooperation with sub-optimal demonstrations. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20) (pp. 3037–3043). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/420
