Heuristic search based exploration in reinforcement learning

Abstract

In this paper, we consider reinforcement learning in systems with an unknown environment, where the agent must trade off efficiently between exploration (long-term optimization) and exploitation (short-term optimization). The ε-greedy algorithm is a method using a near-greedy action selection rule: it behaves greedily (exploitation) most of the time, but every once in a while, with small probability ε, it instead selects an action at random (exploration). Many works have shown that random exploration drives the agent towards poorly modeled states. This study therefore evaluates the role of heuristic-based exploration in reinforcement learning. We propose three methods: neighborhood search based exploration, simulated annealing based exploration, and tabu search based exploration. All three techniques follow the same rule: "explore the most unvisited state". In the simulation, these techniques are evaluated and compared on a discrete reinforcement learning task (robot navigation).
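For concreteness, here is a minimal Python sketch (not taken from the paper) contrasting standard ε-greedy selection with a count-based exploration rule in the spirit of "explore the most unvisited state". The function names, the `visit_counts` bookkeeping, and the temperature schedule are illustrative assumptions, not the authors' implementation.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Standard near-greedy action selection: with probability epsilon,
    pick a uniformly random action; otherwise pick the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def least_visited_exploration(q_values, visit_counts, epsilon=0.1):
    """Illustrative count-based variant of the paper's idea: on an
    exploration step, instead of acting at random, prefer the action
    whose successor state has been visited least often.
    (visit_counts[a] is assumed to hold the visit count of the state
    reached by action a.)"""
    if random.random() < epsilon:
        return min(range(len(q_values)), key=lambda a: visit_counts[a])
    return max(range(len(q_values)), key=lambda a: q_values[a])

def annealed_exploration(q_values, visit_counts, step, t0=1.0, decay=0.999):
    """Simulated-annealing-flavored sketch: the exploration probability
    decays with a temperature schedule, so the agent explores
    aggressively early in training and becomes increasingly greedy."""
    temperature = t0 * (decay ** step)
    if random.random() < temperature:
        return min(range(len(q_values)), key=lambda a: visit_counts[a])
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Example usage with hypothetical values for a four-action state:
q = [0.2, 0.5, 0.1, 0.4]
n = [10, 3, 0, 7]  # visit counts of the successor states
a = least_visited_exploration(q, n, epsilon=0.2)  # favors action 2 when exploring
```

Under this sketch, an exploration step is directed at the least-visited successor rather than chosen uniformly at random, which is the intuition the abstract attributes to all three proposed methods.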

Citation (APA)

Vien, N. A., Viet, N. H., Lee, S. G., & Chung, T. C. (2007). Heuristic search based exploration in reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4507 LNCS, pp. 110–118). https://doi.org/10.1007/978-3-540-73007-1_14
