Combining exploitation-based and exploration-based approach in reinforcement learning

Abstract

Watkins' Q-learning is the most popular and effective model-free method. However, compared with model-based approaches, Q-learning with various exploration strategies requires a large number of trial-and-error interactions to find an optimal policy. To overcome this drawback, we propose a new model-based learning method that extends Q-learning. The method has separate EI and ER functions for learning an exploitation-based and an exploration-based model, respectively. The EI function, based on statistics, indicates the best action. The ER function, based on the information of exploration, leads the learner to poorly explored regions of the global state space by backing up at each step. We also introduce a new criterion as the information of exploration. By combining these functions, we can effectively pursue both exploitation and exploration strategies and select an action that considers both simultaneously.
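
A minimal sketch of the idea described above: an agent keeps a separate exploitation value (EI) and exploration value (ER) per state-action pair and combines them when selecting an action. The concrete update rules, the visit-count novelty bonus, and the weighting parameter beta are illustrative assumptions for this sketch, not the paper's exact formulation.

import math
from collections import defaultdict

class CombinedAgent:
    def __init__(self, actions, alpha=0.1, gamma=0.9, beta=0.5):
        self.actions = actions
        self.alpha, self.gamma, self.beta = alpha, gamma, beta
        self.ei = defaultdict(float)    # exploitation values, learned like Q-learning
        self.er = defaultdict(float)    # exploration values, backed up at each step
        self.visits = defaultdict(int)  # visit statistics per (state, action)

    def select_action(self, state):
        # Score each action by its exploitation value plus a weighted exploration
        # value, so both strategies are considered simultaneously.
        def score(a):
            return self.ei[(state, a)] + self.beta * self.er[(state, a)]
        return max(self.actions, key=score)

    def update(self, s, a, reward, s_next):
        self.visits[(s, a)] += 1
        # EI update: a standard Q-learning backup toward reward + discounted best EI.
        best_next_ei = max(self.ei[(s_next, b)] for b in self.actions)
        self.ei[(s, a)] += self.alpha * (reward + self.gamma * best_next_ei - self.ei[(s, a)])
        # ER update: back up an exploration "reward" that is large for rarely
        # visited pairs, so ER points the learner toward poorly explored regions.
        novelty = 1.0 / math.sqrt(self.visits[(s, a)])
        best_next_er = max(self.er[(s_next, b)] for b in self.actions)
        self.er[(s, a)] += self.alpha * (novelty + self.gamma * best_next_er - self.er[(s, a)])

Under these assumptions, increasing beta shifts action selection toward less-visited regions, while beta = 0 reduces the agent to ordinary greedy Q-learning on the EI table.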

Citation (APA)

Iwata, K., Ito, N., Yamauchi, K., & Ishii, N. (2000). Combining exploitation-based and exploration-based approach in reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1983, pp. 326–331). Springer Verlag. https://doi.org/10.1007/3-540-44491-2_47
