Adaptive Exploration Strategy with Multi-Attribute Decision-Making for Reinforcement Learning

Chunyang Hu; Meng Xu

Journal Article, Open Access

IEEE Access (2020), 8, 32353–32364

DOI: 10.1109/ACCESS.2020.2973169

16 Citations

27 Readers

Abstract

Reinforcement learning (RL) agents often hit a performance bottleneck when they face the exploration-exploitation dilemma. This study proposes an adaptive exploration strategy with multi-attribute decision-making to address that trade-off. First, the method decomposes a complex task into several sub-tasks and trains each sub-task individually with the same training method. It then uses a multi-attribute decision-making method to build an action policy that integrates the results of the trained sub-tasks. Allowing multiple learners to learn in parallel offers practical advantages for learning performance. The adaptive exploration strategy sets the probability of exploration from the information entropy, replacing tedious empirical tuning. Finally, transfer learning extends the applicability of the method. Experiments on robot migration, robot confrontation, and a real wheeled mobile robot demonstrate the feasibility and practicality of the proposed method.
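
The abstract does not give the formula that links information entropy to the exploration probability. As a rough illustration only, the sketch below assumes one plausible reading: the action values of a state are turned into a softmax distribution, and the normalized Shannon entropy of that distribution is used as the exploration probability. The function names, the softmax step, and the `temperature` parameter are all hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Turn action values into a probability distribution (assumed step)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(n), so the result lies in [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def exploration_probability(q_values, temperature=1.0):
    """Hypothetical rule: explore more when the action values are uninformative."""
    return normalized_entropy(softmax(q_values, temperature))


def select_action(q_values, rng=random, temperature=1.0):
    """Epsilon-greedy choice with epsilon taken from the entropy, not hand-tuned."""
    eps = exploration_probability(q_values, temperature)
    if rng.random() < eps:
        return rng.randrange(len(q_values))  # explore: uniform random action
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit
```

Under this assumption, a state whose action values are all equal yields an exploration probability close to 1, while a state with one clearly dominant action yields a value close to 0, so no exploration schedule has to be tuned by hand.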

Cite (APA)

Hu, C., & Xu, M. (2020). Adaptive Exploration Strategy with Multi-Attribute Decision-Making for Reinforcement Learning. IEEE Access, 8, 32353–32364. https://doi.org/10.1109/ACCESS.2020.2973169

Readers' Seniority

PhD / Post grad / Masters / Doc: 9 (64%)
Lecturer / Post doc: 3 (21%)
Professor / Associate Prof.: 1 (7%)
Researcher: 1 (7%)

Readers' Discipline

Engineering: 8 (57%)
Computer Science: 4 (29%)
Business, Management and Accounting: 1 (7%)
Economics, Econometrics and Finance: 1 (7%)
