Data-Efficient MADDPG Based on Self-Attention for IoT Energy Management Systems

Mohammed Al-Saffar; Mustafa Gul

Journal Article, Open Access

IEEE Access (2023) 11 109379-109389

DOI: 10.1109/ACCESS.2023.3322193

3 Citations

13 Readers

Abstract

This study controls and optimizes simulated real-world Demand Response (DR) potential, with household load characteristics analyzed from historical data. To determine the optimal DR potential in smart homes integrated with IoT energy management systems, a multi-agent reinforcement learning framework is among the most suitable solutions for handling the stochastic control activities of various household appliances. The main problem with multi-agent systems, however, is that the agents themselves make the environment nonstationary, which increases system uncertainty. Training therefore requires an excessive number of interactions with the environment, which makes the reinforcement learning model data-inefficient. We propose a new approach, a Multi-Agent Deep Deterministic Policy Gradient based on Bi-directional Long Short-Term Memory and an Attention Mechanism (BiLSTMA-MADDPG), to extract more useful information. The improved MADDPG model uses a BiLSTM layer to store a history of experience in the MADDPG replay buffer, and an attention mechanism to reduce the model's dependence on the number of samples by extracting the most valuable data and ignoring the less important data. In this way, BiLSTMA-MADDPG can outperform conventional MADDPG even in a small-sample environment, motivating the exploration of a more robust and data-efficient regime. The attention mechanism thus enables MADDPG to learn more effectively and scalably in complex real-world multi-agent environments. Simulation results are obtained for a household environment in which three cooperating agents control a washing machine, an air conditioner, and an electric vehicle. The validated model shows improved data efficiency and convergence speed, and shows promise for real-life application in terms of appliance energy consumption.
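To make the attention idea in the abstract concrete, the following is a minimal sketch of scaled dot-product self-attention over a short history of encoded experience. It is not the authors' implementation: the abstract gives no architectural details, so the identity query/key/value projections, the two-dimensional feature vectors, and the names `self_attention` and `history` are all assumptions chosen for illustration. In the BiLSTMA-MADDPG model the inputs would presumably be BiLSTM outputs and the projections would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(sequence):
    """Scaled dot-product self-attention with identity projections (an assumption).

    sequence: list of equal-length feature vectors, one per time step.
    Returns (contexts, weights); weights[i][j] is how strongly step i attends to step j.
    """
    dim = len(sequence[0])
    scale = math.sqrt(dim)
    contexts, weights = [], []
    for query in sequence:
        # Score the query against every step, then normalize to a distribution.
        row = softmax([dot(query, key) / scale for key in sequence])
        weights.append(row)
        # Context vector: attention-weighted sum of all steps.
        contexts.append(
            [sum(w * value[k] for w, value in zip(row, sequence)) for k in range(dim)]
        )
    return contexts, weights


# Hypothetical window of three encoded transitions (illustrative values only).
history = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contexts, weights = self_attention(history)
```

Each row of `weights` sums to one, so steps that resemble the query contribute more to its context vector and dissimilar steps contribute less. This is the sense in which attention can emphasize the most valuable samples and down-weight the rest.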

Citation (APA)

Al-Saffar, M., & Gul, M. (2023). Data-Efficient MADDPG Based on Self-Attention for IoT Energy Management Systems. IEEE Access, 11, 109379–109389. https://doi.org/10.1109/ACCESS.2023.3322193
