The ultimate goal of military intelligence is to equip the command and control (C2) system with the decision-making skill of excellent human commanders while remaining more agile and stable than a human. The intelligent commander Alpha C2 solves the dynamic decision-making problem in complex air-defense scenarios using a deep reinforcement learning framework. Unlike traditional C2 systems, which rely on expert rules and hand-built decision models, Alpha C2 interacts with a digital battlefield close to the real world and generates its own learning data. The states of multiple parties are fused as input, a gated recurrent unit (GRU) network introduces historical information, and an attention mechanism selects the object of action, making the output decisions more reliable. Without learning from human combat experience, the neural network is trained in fixed-strategy and random-strategy scenarios with the proximal policy optimization (PPO) algorithm. Finally, 1,000 rounds of offline confrontation on the digital battlefield show that Alpha C2 trained against the random strategy generalizes better and defeats the opponent with a higher winning rate than an Expert C2 system (72% vs. 21%). Its use of resources is also more reasonable than that of Expert C2, reflecting a flexible and adaptive art of command.
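The pipeline the abstract describes (a GRU encoding the history of fused battlefield states, an attention mechanism scoring candidate objects of action, and a value head for PPO) can be sketched roughly as below. This is a minimal illustration only; the paper's actual implementation is not given, and every class, dimension, and parameter name here is a hypothetical assumption.

```python
import torch
import torch.nn as nn

class AlphaC2PolicySketch(nn.Module):
    """Hypothetical sketch, not the paper's code: a GRU summarizes the
    history of fused multi-party states, and dot-product attention over
    candidate targets yields a distribution over the object of action."""
    def __init__(self, state_dim: int, hidden_dim: int, target_dim: int):
        super().__init__()
        self.gru = nn.GRU(state_dim, hidden_dim, batch_first=True)
        self.query = nn.Linear(hidden_dim, hidden_dim)   # query from history
        self.key = nn.Linear(target_dim, hidden_dim)     # keys from targets
        self.value_head = nn.Linear(hidden_dim, 1)       # critic for PPO

    def forward(self, state_seq, targets):
        # state_seq: (batch, time, state_dim) history of fused states
        # targets:   (batch, n_targets, target_dim) candidate objects of action
        out, _ = self.gru(state_seq)
        q = self.query(out[:, -1])                       # (batch, hidden)
        k = self.key(targets)                            # (batch, n, hidden)
        scores = torch.bmm(k, q.unsqueeze(-1)).squeeze(-1) / k.size(-1) ** 0.5
        probs = torch.softmax(scores, dim=-1)            # target distribution
        value = self.value_head(out[:, -1])              # state value for PPO
        return probs, value

policy = AlphaC2PolicySketch(state_dim=16, hidden_dim=32, target_dim=8)
probs, value = policy(torch.randn(2, 5, 16), torch.randn(2, 4, 8))
```

In a PPO training loop, `probs` would parameterize the action distribution whose log-probability ratio enters the clipped surrogate objective, while `value` supplies the baseline for advantage estimation.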
Fu, Q., Fan, C. L., Song, Y., & Guo, X. K. (2020). Alpha C2-An Intelligent Air Defense Commander Independent of Human Decision-Making. IEEE Access, 8, 87504–87516. https://doi.org/10.1109/ACCESS.2020.2993459