Research on Wargame Decision-Making Method Based on Multi-Agent Deep Deterministic Policy Gradient

Abstract

Wargames are essential simulators for a wide range of war scenarios. However, the increasing pace of warfare has rendered traditional wargame decision-making methods inadequate. Wargame-assisted decision-making methods that leverage artificial intelligence techniques, notably reinforcement learning, have emerged as a promising response. Current wargame environments have a large decision space and sparse rewards, both of which make decision-making methods hard to optimize. To overcome these obstacles, this paper presents a wargame decision-making method based on an improved Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. MADDPG is adapted to the wargame environment by applying a Partially Observable Markov Decision Process (POMDP) formulation, a joint action-value function, and the Gumbel-Softmax estimator. In the proposed approach, supervised learning is used to improve training efficiency and to reduce the space for manipulation before the reinforcement learning phase. In addition, a policy gradient estimator is incorporated to reduce the action space and to obtain the global optimal solution, and an additional reward function is designed to address the sparse reward problem. The experimental results show that the proposed method outperforms both the pre-optimization MADDPG algorithm and other algorithms based on the actor-critic (AC) framework in the wargame environment. The approach offers a promising solution to decision-making in wargame scenarios, particularly given the increasing speed and complexity of modern warfare.
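The abstract names the Gumbel-Softmax estimator as one of the components used to adapt MADDPG, which outputs continuous actions, to a discrete action setting. The following is a minimal sketch of the general sampling step only, not the authors' implementation: the function names `gumbel_softmax_sample` and `hard_action`, the `temperature` parameter, and the example logits are all assumptions made for illustration. It uses only the Python standard library and omits the gradient machinery a real training loop would need.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Draw one relaxed (soft) one-hot sample over discrete actions.

    Hypothetical helper for illustration: `logits` are unnormalized
    action scores, `temperature` controls how close the sample is to
    a one-hot vector (lower means sharper).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    noisy = []
    for logit in logits:
        # Uniform sample, clamped away from 0 and 1 so both logs stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)).
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)

    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def hard_action(probs):
    """Pick the discrete action index with the largest relaxed weight."""
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    # Assumed example scores for three discrete actions.
    sample = gumbel_softmax_sample([1.0, 2.0, 0.5], temperature=0.5)
    print(sample, hard_action(sample))
```

The relaxed sample is a probability vector, so it can be fed to a critic that expects continuous inputs, while `hard_action` recovers the discrete choice actually executed in the environment. Lowering `temperature` pushes samples toward one-hot vectors at the cost of higher-variance gradients in a differentiable implementation.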

Cite

Yu, S., Zhu, W., & Wang, Y. (2023). Research on Wargame Decision-Making Method Based on Multi-Agent Deep Deterministic Policy Gradient. Applied Sciences (Switzerland), 13(7). https://doi.org/10.3390/app13074569