In this paper, we consider reinforcement learning in systems with an unknown environment, where the agent must trade off efficiently between exploration (long-term optimization) and exploitation (short-term optimization). The ε-greedy algorithm is a near-greedy action-selection rule: it behaves greedily (exploitation) most of the time but, with a small probability ε, instead selects an action at random (exploration). Many prior works have shown that purely random exploration can drive the agent toward poorly modeled states. This study therefore evaluates the role of heuristic-based exploration in reinforcement learning. We propose three methods: neighborhood-search-based exploration, simulated-annealing-based exploration, and tabu-search-based exploration. All three techniques follow the same rule: "explore the most unvisited state." In simulation, these techniques are evaluated and compared on a discrete reinforcement learning task (robot navigation). © Springer-Verlag Berlin Heidelberg 2007.
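The contrast between random ε-greedy exploration and the "explore the most unvisited state" heuristic can be illustrated with a minimal sketch. The corridor environment, the `step` transition function, and the count-based rule below are illustrative assumptions, not the paper's actual methods (which use neighborhood search, simulated annealing, and tabu search); the sketch only shows the shared idea of steering exploration steps toward less-visited states instead of choosing uniformly at random.

```python
import random
from collections import defaultdict

# Hypothetical 1-D corridor of 5 states; each action moves left (-1) or right (+1).
N_STATES = 5
ACTIONS = (-1, +1)

def step(state, action):
    """Deterministic transition: move along the corridor, clipped at the ends."""
    return min(max(state + action, 0), N_STATES - 1)

def epsilon_greedy(Q, state, epsilon, rng):
    """Near-greedy rule: exploit argmax_a Q[s, a] with probability 1 - epsilon;
    otherwise pick a uniformly random action (random exploration)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def count_based(Q, state, epsilon, visits, rng):
    """Heuristic variant sketching "explore the most unvisited state":
    on an exploration step, move toward the least-visited successor state
    rather than a uniformly random one."""
    if rng.random() < epsilon:
        return min(ACTIONS, key=lambda a: visits[step(state, a)])
    return max(ACTIONS, key=lambda a: Q[(state, a)])
```

With ε = 0 both rules act greedily; with ε > 0 the count-based rule biases exploration toward states the agent has seen least, which is the behavior the heuristic methods aim to encourage.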
CITATION STYLE
Vien, N. A., Viet, N. H., Lee, S. G., & Chung, T. C. (2007). Heuristic search based exploration in reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4507 LNCS, pp. 110–118). https://doi.org/10.1007/978-3-540-73007-1_14