Heuristic search based exploration in reinforcement learning

Abstract

In this paper, we consider reinforcement learning in systems with an unknown environment, where the agent must trade off efficiently between exploration (long-term optimization) and exploitation (short-term optimization). The ε-greedy algorithm is a method using a near-greedy action selection rule: it behaves greedily (exploitation) most of the time, but every once in a while, with small probability ε, it instead selects an action at random (exploration). Many works have shown that random exploration drives the agent towards poorly modeled states. This study therefore evaluates the role of heuristic-based exploration in reinforcement learning. We propose three methods: neighborhood search based exploration, simulated annealing based exploration, and tabu search based exploration. All three techniques follow the same rule: "explore the most unvisited state". In the simulation, these techniques are evaluated and compared on a discrete reinforcement learning task (robot navigation).
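For concreteness, here is a minimal Python sketch (not taken from the paper) contrasting standard ε-greedy selection with a count-based exploration rule in the spirit of "explore the most unvisited state". The function names, the `visit_counts` bookkeeping, and the temperature schedule are illustrative assumptions, not the authors' implementation.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Standard near-greedy action selection: with probability epsilon,
    pick a uniformly random action; otherwise pick the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def least_visited_exploration(q_values, visit_counts, epsilon=0.1):
    """Illustrative count-based variant of the paper's idea: on an
    exploration step, instead of acting at random, prefer the action
    whose successor state has been visited least often.
    (visit_counts[a] is assumed to hold the visit count of the state
    reached by action a.)"""
    if random.random() < epsilon:
        return min(range(len(q_values)), key=lambda a: visit_counts[a])
    return max(range(len(q_values)), key=lambda a: q_values[a])

def annealed_exploration(q_values, visit_counts, step, t0=1.0, decay=0.999):
    """Simulated-annealing-flavored sketch: the exploration probability
    decays with a temperature schedule, so the agent explores
    aggressively early in training and becomes increasingly greedy."""
    temperature = t0 * (decay ** step)
    if random.random() < temperature:
        return min(range(len(q_values)), key=lambda a: visit_counts[a])
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Example usage with hypothetical values for a four-action state:
q = [0.2, 0.5, 0.1, 0.4]
n = [10, 3, 0, 7]  # visit counts of the successor states
a = least_visited_exploration(q, n, epsilon=0.2)  # favors action 2 when exploring
```

Under this sketch, an exploration step is directed at the least-visited successor rather than chosen uniformly at random, which is the intuition the abstract attributes to all three proposed methods.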

Citation (APA)

Vien, N. A., Viet, N. H., Lee, S. G., & Chung, T. C. (2007). Heuristic search based exploration in reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4507 LNCS, pp. 110–118). https://doi.org/10.1007/978-3-540-73007-1_14
