Completing explorer games with a deep reinforcement learning framework based on behavior angle navigation

5Citations
Citations of this article
15Readers
Mendeley users who have this article in their library.

Abstract

In cognitive electronic warfare, when a typical combat vehicle, such as an unmanned combat air vehicle (UCAV), uses radar sensors to explore an unknown space, the target-searching fails due to an inefficient servoing/tracking system. Thus, to solve this problem, we developed an autonomous reasoning search method that can generate efficient decision-making actions and guide the UCAV as early as possible to the target area. For high-dimensional continuous action space, the UCAV’s maneuvering strategies are subject to certain physical constraints. We first record the path histories of the UCAV as a sample set of supervised experiments and then construct a grid cell network using long short-term memory (LSTM) to generate a new displacement prediction to replace the target location estimation. Finally, we enable a variety of continuous-control-based deep reinforcement learning algorithms to output optimal/sub-optimal decision-making actions. All these tasks are performed in a three-dimensional target-searching simulator, i.e., the Explorer game. Please note that we use the behavior angle (BHA) for the first time as the main factor of the reward-shaping of the deep reinforcement learning framework and successfully make the trained UCAV achieve a 99.96% target destruction rate, i.e., the game win rate, in a 0.1 s operating cycle.

Cite

CITATION STYLE

APA

You, S., Diao, M., & Gao, L. (2019). Completing explorer games with a deep reinforcement learning framework based on behavior angle navigation. Electronics (Switzerland), 8(5). https://doi.org/10.3390/electronics8050576

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free