Using MDP characteristics to guide exploration in reinforcement learning


This article is free to access.

Abstract

We present a new approach to exploration in Reinforcement Learning (RL) based on certain properties of Markov Decision Processes (MDPs). Our strategy facilitates a more uniform visitation of the state space, promotes more extensive sampling of actions with potentially high variance in their action-value estimates, and encourages the RL agent to focus on states where it has the most control over the outcomes of its actions. The strategy can be combined with existing exploration techniques, and we demonstrate experimentally that it can improve the performance of both undirected and directed exploration methods. In contrast to other directed methods, the exploration-relevant information can be precomputed beforehand and then used during learning without additional computational cost.
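The abstract describes three exploration signals: favoring rarely visited state–action pairs, favoring actions whose value estimates have high variance, and favoring states where the agent has high control over outcomes. The sketch below is an illustrative reconstruction of that idea, not the paper's actual algorithm; the function names, the weighting scheme, and the specific novelty formula are assumptions made for clarity.

```python
import random
from collections import defaultdict


def exploration_score(state, action, visits, q_var, control, w=(1.0, 1.0, 1.0)):
    """Combine three MDP-derived exploration signals into a single score.

    visits[(s, a)]  -- count of how often (s, a) has been tried
                       (rarely tried pairs get a higher novelty bonus)
    q_var[(s, a)]   -- running variance estimate of the action-value target
    control[s]      -- precomputed 'controllability' of state s, i.e. how much
                       the agent's action choice influences the next-state
                       distribution (the paper precomputes such quantities
                       before learning)
    """
    w_visit, w_var, w_ctrl = w
    novelty = 1.0 / (1.0 + visits[(state, action)])
    return (w_visit * novelty
            + w_var * q_var[(state, action)]
            + w_ctrl * control[state])


def select_action(state, actions, q, visits, q_var, control, epsilon=0.1):
    """Epsilon-greedy variant: with probability epsilon, pick the action with
    the highest exploration score instead of a uniformly random one."""
    if random.random() < epsilon:
        return max(actions,
                   key=lambda a: exploration_score(state, a, visits,
                                                   q_var, control))
    return max(actions, key=lambda a: q[(state, a)])
```

Because the controllability term depends only on the MDP's transition structure, `control` can be filled in once before learning starts, which matches the abstract's claim that the exploration-relevant information adds no computational cost during learning.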




Citation (APA)

Ratitch, B., & Precup, D. (2003). Using MDP characteristics to guide exploration in reinforcement learning. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 2837, pp. 313–324). Springer Verlag. https://doi.org/10.1007/978-3-540-39857-8_29

