Abstract
Reinforcement Learning (RL) agents often hit a performance bottleneck when facing the dilemma between exploration and exploitation. In this study, an adaptive exploration strategy with multi-attribute decision-making is proposed to address this trade-off. First, the proposed method decomposes a complex task into several sub-tasks and trains each sub-task individually with the same training method. Then, a multi-attribute decision-making method integrates the training results of these sub-tasks into a single action policy. Allowing multiple learners to learn in parallel offers practical advantages for learning performance. The adaptive exploration strategy determines the exploration probability from the information entropy, replacing laborious empirical tuning. Finally, transfer learning extends the applicability of the proposed method. Experiments on robot migration, robot confrontation, and a real wheeled mobile robot demonstrate the effectiveness and practicality of the proposed method.
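The abstract does not give the paper's exact formula for entropy-based exploration, but the core idea can be sketched as follows: derive an action-preference distribution from the agent's value estimates, measure its Shannon entropy, and scale the exploration probability with that entropy (uncertain preferences mean more exploration, confident ones mean more exploitation). The function name, softmax preference model, and `eps_min`/`eps_max` interpolation below are illustrative assumptions, not the authors' method.

```python
import math

def exploration_probability(q_values, eps_min=0.05, eps_max=0.9, temperature=1.0):
    """Hypothetical entropy-adaptive epsilon: high policy entropy
    (uncertain action preferences) -> explore more; low entropy -> exploit."""
    # Softmax over Q-values gives an action-preference distribution
    # (max-subtraction keeps the exponentials numerically stable).
    m = max(q_values)
    exp_q = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(exp_q)
    probs = [e / z for e in exp_q]
    # Shannon entropy, normalised by its maximum value log(n),
    # so norm_entropy lies in [0, 1].
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    norm_entropy = entropy / math.log(len(q_values)) if len(q_values) > 1 else 0.0
    # Interpolate the exploration probability by the normalised entropy.
    return eps_min + (eps_max - eps_min) * norm_entropy

# Uniform Q-values: maximal entropy, so epsilon is close to eps_max;
# one dominant Q-value: low entropy, so epsilon is close to eps_min.
print(exploration_probability([1.0, 1.0, 1.0, 1.0]))
print(exploration_probability([10.0, 0.0, 0.0, 0.0]))
```

The design choice mirrors the abstract's stated goal: the exploration rate adapts to how undecided the learned policy currently is, rather than following a hand-tuned decay schedule.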
Citation
Hu, C., & Xu, M. (2020). Adaptive Exploration Strategy with Multi-Attribute Decision-Making for Reinforcement Learning. IEEE Access, 8, 32353–32364. https://doi.org/10.1109/ACCESS.2020.2973169