Adaptive Exploration Strategy with Multi-Attribute Decision-Making for Reinforcement Learning

This article is free to access.

Abstract

Reinforcement learning (RL) agents often hit a performance bottleneck when they face the exploration-exploitation dilemma. This study proposes an adaptive exploration strategy with multi-attribute decision-making to address that trade-off. First, the method decomposes a complex task into several sub-tasks and trains each sub-task individually with the same training method. It then uses a multi-attribute decision-making method to build an action policy that integrates the results of the trained sub-tasks. Letting multiple learners train in parallel offers a practical way to improve learning performance. The adaptive exploration strategy sets the exploration probability from the information entropy, which replaces laborious empirical tuning. Finally, transfer learning extends the applicability of the method. Experiments on robot migration, robot confrontation, and a real wheeled mobile robot demonstrate the feasibility and practicality of the proposed method.
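
The abstract does not give the formula that maps information entropy to an exploration probability, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes the action distribution is a softmax over the current Q-values and that the exploration probability equals the Shannon entropy of that distribution, normalized to [0, 1]. The function names and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Turn action values into a probability distribution (assumed form)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(n), so the result lies in [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def exploration_probability(q_values, temperature=1.0):
    """Assumption: explore more when the action values are less decisive."""
    return normalized_entropy(softmax(q_values, temperature))


def select_action(q_values, rng=random, temperature=1.0):
    """Epsilon-greedy choice where epsilon comes from the entropy above."""
    eps = exploration_probability(q_values, temperature)
    if rng.random() < eps:
        return rng.randrange(len(q_values))  # explore: uniform random action
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit
```

With equal Q-values the entropy is maximal and the agent always explores; once one action clearly dominates, the exploration probability falls toward zero without a hand-tuned decay schedule.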

Citation (APA)

Hu, C., & Xu, M. (2020). Adaptive Exploration Strategy with Multi-Attribute Decision-Making for Reinforcement Learning. IEEE Access, 8, 32353–32364. https://doi.org/10.1109/ACCESS.2020.2973169
