Combining exploitation-based and exploration-based approach in reinforcement learning

Abstract

Watkins' Q-learning is the most popular and effective model-free method. However, compared with model-based approaches, Q-learning with various exploration strategies requires a large number of trial-and-error interactions to find an optimal policy. To overcome this drawback, we propose a new model-based learning method that extends Q-learning. The method has separate EI and ER functions for learning an exploitation-based and an exploration-based model, respectively. The EI function, based on statistics, indicates the best action. The ER function, based on the information of exploration, leads the learner to poorly explored regions of the global state space by backing up at each step. We also introduce a new criterion as the information of exploration. By combining these functions, we can effectively pursue both exploitation and exploration strategies and select an action that considers both simultaneously.
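
A minimal sketch of the idea described above: an agent keeps a separate exploitation value (EI) and exploration value (ER) per state-action pair and combines them when selecting an action. The concrete update rules, the visit-count novelty bonus, and the weighting parameter beta are illustrative assumptions for this sketch, not the paper's exact formulation.

import math
from collections import defaultdict

class CombinedAgent:
    def __init__(self, actions, alpha=0.1, gamma=0.9, beta=0.5):
        self.actions = actions
        self.alpha, self.gamma, self.beta = alpha, gamma, beta
        self.ei = defaultdict(float)    # exploitation values, learned like Q-learning
        self.er = defaultdict(float)    # exploration values, backed up at each step
        self.visits = defaultdict(int)  # visit statistics per (state, action)

    def select_action(self, state):
        # Score each action by its exploitation value plus a weighted exploration
        # value, so both strategies are considered simultaneously.
        def score(a):
            return self.ei[(state, a)] + self.beta * self.er[(state, a)]
        return max(self.actions, key=score)

    def update(self, s, a, reward, s_next):
        self.visits[(s, a)] += 1
        # EI update: a standard Q-learning backup toward reward + discounted best EI.
        best_next_ei = max(self.ei[(s_next, b)] for b in self.actions)
        self.ei[(s, a)] += self.alpha * (reward + self.gamma * best_next_ei - self.ei[(s, a)])
        # ER update: back up an exploration "reward" that is large for rarely
        # visited pairs, so ER points the learner toward poorly explored regions.
        novelty = 1.0 / math.sqrt(self.visits[(s, a)])
        best_next_er = max(self.er[(s_next, b)] for b in self.actions)
        self.er[(s, a)] += self.alpha * (novelty + self.gamma * best_next_er - self.er[(s, a)])

Under these assumptions, increasing beta shifts action selection toward less-visited regions, while beta = 0 reduces the agent to ordinary greedy Q-learning on the EI table.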

Citation (APA)

Iwata, K., Ito, N., Yamauchi, K., & Ishii, N. (2000). Combining exploitation-based and exploration-based approach in reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1983, pp. 326–331). Springer Verlag. https://doi.org/10.1007/3-540-44491-2_47
