MTMA-DDPG: A Deep Deterministic Policy Gradient Reinforcement Learning for Multi-task Multi-agent Environments

Abstract

Beyond multi-task or multi-agent learning alone, in this work we develop a multi-agent reinforcement learning algorithm that handles multi-task environments. Our proposed algorithm, Multi-Task Multi-Agent Deep Deterministic Policy Gradient (MTMA-DDPG; code available at https://gitlab.com/awadailab/mtmaddpg), extends its single-task counterpart by running multiple tasks on distributed nodes and communicating parameters across the nodes via pre-determined coefficients. Parameter sharing is modulated through a temporal decay of the communication coefficients. Training is parallelized across nodes without any centralized controller for the different tasks, which opens the door to flexibly leveraging parallel processing to improve multi-agent learning. Empirically, we design different multi-agent particle environments in which tasks are either similar or heterogeneous. We study the performance of MTMA-DDPG in terms of reward, convergence, variance, and communication overhead. We demonstrate both the improvement of our algorithm over its single-task counterpart and the importance of a versatile technique for taking advantage of parallel computing resources.
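The abstract describes per-task DDPG learners exchanging parameters through pre-determined coefficients that decay over time. The sketch below illustrates one plausible reading of that mechanism in PyTorch; the function names, the exponential decay schedule, and the row-normalized coefficient matrix are illustrative assumptions, not the paper's exact implementation (which is in the linked repository).

```python
import torch

def decayed_coefficients(base_coeffs, step, decay_rate=1e-3):
    """Temporally decay the cross-task communication coefficients.

    base_coeffs: (T, T) tensor; row i holds task i's mixing weights over
    all T task nodes. As decay progresses, each row is pulled toward a
    one-hot vector on its own task, so parameter sharing fades and each
    node ends up training on its own task alone.
    """
    T = base_coeffs.shape[0]
    eye = torch.eye(T)
    alpha = torch.exp(torch.tensor(-decay_rate * step))  # in (0, 1]
    coeffs = alpha * base_coeffs + (1 - alpha) * eye
    return coeffs / coeffs.sum(dim=1, keepdim=True)      # keep rows normalized

@torch.no_grad()
def mix_parameters(task_params, coeffs):
    """Blend parameters across task nodes.

    task_params: list of length T; task_params[i] is a list of tensors
    (e.g., one task's actor weights, shapes identical across tasks).
    Returns the coefficient-weighted mixture for every task.
    """
    mixed = []
    for i in range(len(task_params)):
        mixed_i = []
        for layer in zip(*task_params):  # same layer across all tasks
            blended = sum(coeffs[i, j] * w for j, w in enumerate(layer))
            mixed_i.append(blended)
        mixed.append(mixed_i)
    return mixed

# Hypothetical usage: 3 similar tasks with mild uniform sharing at the start.
base = torch.full((3, 3), 0.1) + 0.7 * torch.eye(3)
coeffs = decayed_coefficients(base, step=5000)
```

Since each node only needs the other nodes' parameters (not their gradients or experience), this mixing step is the only communication required, which is consistent with the abstract's claim that training runs in parallel without a centralized controller.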

Citation (APA)

Hamadeh, K., El Zini, J., Hajar, J., & Awad, M. (2022). MTMA-DDPG: A Deep Deterministic Policy Gradient Reinforcement Learning for Multi-task Multi-agent Environments. In IFIP Advances in Information and Communication Technology (Vol. 646 IFIP, pp. 270–281). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-08333-4_22
