Weighted double deep multiagent reinforcement learning in stochastic cooperative environments

Yan Zheng; Zhaopeng Meng; Jianye Hao; Zongzhang Zhang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11013 LNAI 421-429

DOI: 10.1007/978-3-319-97310-4_48

31 Citations

74 Readers

Abstract

Recently, multiagent deep reinforcement learning (DRL) has received increasingly wide attention. Existing multiagent DRL algorithms are inefficient when faced with the non-stationarity that arises because agents update their policies simultaneously in stochastic cooperative environments. This paper extends the recently proposed weighted double estimator to the multiagent domain and proposes a multiagent DRL framework named weighted double deep Q-network (WDDQN). By combining the weighted double estimator with a deep neural network, WDDQN not only reduces the bias effectively but also extends to scenarios with raw visual inputs. To achieve efficient cooperation in the multiagent domain, we introduce a lenient reward network and a scheduled replay strategy. Experiments show that WDDQN outperforms existing DRL and multiagent DRL algorithms, namely double DQN and lenient Q-learning, in terms of average reward and convergence rate in stochastic cooperative environments.
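
The abstract does not spell out how the weighted double estimator forms its target, so the following is only a minimal sketch under stated assumptions. It assumes the form commonly given for weighted double Q-learning: two value estimators, U and V, are blended at the next state, with the blend weight growing as V's gap between U's greedy action and U's worst action grows. The function name, the argument names, and the constant `c` are hypothetical and are not taken from the paper; WDDQN's networks, lenient reward network, and scheduled replay are not modelled here.

```python
def weighted_double_target(q_u_next, q_v_next, reward, gamma=0.99, c=1.0):
    """Sketch of a weighted double estimator target for one transition.

    q_u_next, q_v_next: per-action values at the next state from two
    separate estimators (hypothetical names U and V).
    c: positive constant controlling the blend (assumed hyperparameter).
    """
    if c <= 0:
        raise ValueError("c must be positive")
    actions = range(len(q_u_next))
    # Greedy and worst actions are both chosen with estimator U.
    a_star = max(actions, key=lambda a: q_u_next[a])
    a_low = min(actions, key=lambda a: q_u_next[a])
    # The weight depends on how far apart V rates those two actions.
    gap = abs(q_v_next[a_star] - q_v_next[a_low])
    beta = gap / (c + gap)
    # beta near 1 leans on the single estimator (U evaluates its own choice);
    # beta near 0 leans on the double estimator (V evaluates U's choice).
    blended = beta * q_u_next[a_star] + (1.0 - beta) * q_v_next[a_star]
    return reward + gamma * blended


if __name__ == "__main__":
    target = weighted_double_target(
        [1.0, 3.0, 2.0], [0.5, 2.0, 4.0], reward=1.0, gamma=0.5, c=1.0
    )
    print(target)
```

In this sketch a very large `c` pushes the target toward the plain double estimator, while a small `c` pushes it toward the single (max) estimator, which is the sense in which the weighting trades off the two kinds of bias.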

Cite

APA

Zheng, Y., Meng, Z., Hao, J., & Zhang, Z. (2018). Weighted double deep multiagent reinforcement learning in stochastic cooperative environments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11013 LNAI, pp. 421–429). Springer Verlag. https://doi.org/10.1007/978-3-319-97310-4_48
