Data-efficient reinforcement learning using active exploration method

Dongfang Zhao; Jiafeng Liu; Rui Wu; Dansong Cheng; Xianglong Tang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018), vol. 11303 LNCS, pp. 265-276

DOI: 10.1007/978-3-030-04182-3_24

Abstract

Reinforcement learning (RL) is an effective method for controlling dynamic systems without prior knowledge. One of the most important and difficult problems in RL is how to improve data efficiency. PILCO is a state-of-the-art data-efficient framework that uses Gaussian processes (GPs) to model the system dynamics. However, it focuses only on optimizing cumulative reward and does not consider the accuracy of the dynamics model, which is an important factor in controller learning. To further improve the data efficiency of PILCO, we propose an active exploration version of PILCO (AEPILCO) that uses information entropy to characterize samples. In the policy evaluation stage, we incorporate an information-entropy criterion into the long-term sample prediction. With this informative policy evaluation function, the algorithm obtains informative policy parameters in the policy improvement stage. Executing these policy parameters on the real system produces an informative sample set, which helps in learning an accurate dynamics model. AEPILCO thus improves data efficiency by learning an accurate dynamics model from samples actively selected with the information-entropy criterion. We demonstrate the validity and efficiency of the proposed algorithm on several challenging control problems: cart-pole, pendubot, double pendulum, and cart-double-pendulum. Both theoretical analysis and experimental results verify that AEPILCO can learn a controller in fewer trials.
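The abstract does not give the exact form of the informative policy evaluation function, so the following is only a minimal sketch of the general idea, under stated assumptions: the long-term prediction yields one Gaussian state distribution per time step, the information-entropy criterion is taken to be the differential entropy of each predicted Gaussian, and the two terms are combined with a weight. The function names `gaussian_entropy` and `informative_objective` and the weight `beta` are hypothetical and do not come from the paper.

```python
import numpy as np


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a multivariate Gaussian.

    Uses the standard closed form 0.5 * (d * log(2*pi*e) + log det(cov)).
    """
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    d = cov.shape[0]
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance must be positive definite")
    return 0.5 * (d * np.log(2.0 * np.pi * np.e) + logdet)


def informative_objective(expected_costs, predicted_covs, beta=0.1):
    """Hypothetical entropy-augmented policy evaluation (to be minimized).

    expected_costs: expected cost at each predicted time step.
    predicted_covs: covariance of the predicted state at each time step.
    beta: assumed trade-off weight between cost and information gain.
    """
    total_cost = float(np.sum(expected_costs))
    total_entropy = sum(gaussian_entropy(c) for c in predicted_covs)
    # Higher predictive entropy is treated as more informative,
    # so it lowers the value being minimized.
    return total_cost - beta * total_entropy


# Two candidate rollouts with the same expected cost but different
# predictive uncertainty; the more uncertain one scores lower (better).
costs = [1.0, 0.8, 0.5]
confident = [0.01 * np.eye(2)] * 3
uncertain = [1.0 * np.eye(2)] * 3
j_confident = informative_objective(costs, confident)
j_uncertain = informative_objective(costs, uncertain)
```

In this sketch, a policy whose predicted trajectory passes through regions where the model is uncertain receives a lower objective value than an equally costly policy in well-modelled regions, which is one plausible way to steer real executions toward informative samples. How AEPILCO actually weights or computes the criterion is not specified in the abstract.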

Citation (APA)

Zhao, D., Liu, J., Wu, R., Cheng, D., & Tang, X. (2018). Data-efficient reinforcement learning using active exploration method. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11303 LNCS, pp. 265–276). Springer Verlag. https://doi.org/10.1007/978-3-030-04182-3_24
