Sparse gradient-based direct policy search

Abstract

Reinforcement learning is challenging when state and action spaces are continuous. The discretization of state and action spaces, and the real-time adaptation of that discretization, are critical issues in reinforcement learning problems. In this contribution we consider adaptive discretization and introduce a sparse gradient-based direct policy search method. We address the issue of efficient state/action selection in gradient-based direct policy search by imposing sparsity through an L1 penalty term. We propose to start learning with a fine discretization of the state space and to induce sparsity via the L1 norm. We compare the proposed approach to state-of-the-art methods, such as progressive widening Q-learning, which adaptively updates the discretization of the states, and to classic as well as sparse Q-learning with linear function approximation. Our experiments on standard reinforcement learning benchmarks demonstrate that the proposed approach is efficient. © 2012 Springer-Verlag.
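To make the idea concrete, the sketch below illustrates the general recipe described in the abstract: a policy is parameterized over a deliberately fine discretization of a continuous state space, updated by a policy-gradient (REINFORCE-style) rule, and an L1 proximal (soft-thresholding) step shrinks the parameters of unneeded bins to exactly zero. This is not the authors' exact algorithm; the toy one-dimensional environment, the proximal implementation of the L1 penalty, and all hyperparameter values are assumptions made only to keep the example self-contained and runnable.

```python
# Minimal sketch (assumptions noted above, not the paper's exact method):
# REINFORCE over a finely discretized 1-D state space with an L1 proximal
# step that induces sparsity in the policy parameters.
import numpy as np

N_BINS = 100          # fine discretization of a state in [0, 1]
N_ACTIONS = 2         # move left / move right
ALPHA = 0.05          # learning rate
LAMBDA_L1 = 1e-3      # strength of the L1 penalty
GAMMA = 0.99          # discount factor

rng = np.random.default_rng(0)
theta = np.zeros((N_BINS, N_ACTIONS))   # policy parameters, one row per bin


def bin_index(s):
    """Map a continuous state in [0, 1] to one of the fine bins."""
    return min(int(s * N_BINS), N_BINS - 1)


def policy(s):
    """Softmax distribution over actions for the bin containing state s."""
    logits = theta[bin_index(s)]
    p = np.exp(logits - logits.max())
    return p / p.sum()


def step(s, a):
    """Hypothetical toy dynamics: action 1 drifts right, action 0 left; goal near s = 1."""
    s_next = np.clip(s + (0.02 if a == 1 else -0.02)
                     + 0.005 * rng.standard_normal(), 0.0, 1.0)
    done = s_next >= 0.99
    reward = 1.0 if done else -0.01
    return s_next, reward, done


def soft_threshold(x, t):
    """Proximal operator of the L1 norm: shrinks parameters toward exact zeros."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


for episode in range(2000):
    # Roll out one episode with the current stochastic policy.
    s, done, traj = 0.1, False, []
    while not done and len(traj) < 200:
        p = policy(s)
        a = rng.choice(N_ACTIONS, p=p)
        s_next, r, done = step(s, a)
        traj.append((s, a, r))
        s = s_next

    # REINFORCE update: weight each log-policy gradient by the return that follows it.
    G = 0.0
    for s_t, a_t, r_t in reversed(traj):
        G = r_t + GAMMA * G
        b = bin_index(s_t)
        grad = -policy(s_t)            # d log pi / d logits = onehot(a) - pi
        grad[a_t] += 1.0
        theta[b] += ALPHA * G * grad

    # L1 proximal step: parameters of bins that do not help stay at zero,
    # effectively coarsening the initially fine discretization.
    theta = soft_threshold(theta, ALPHA * LAMBDA_L1)

print("non-zero parameter rows:", int(np.count_nonzero(np.abs(theta).sum(axis=1))))
```

In this sketch the sparsity pattern of `theta` plays the role of the adaptive discretization: bins whose parameters remain zero are effectively merged into the default behaviour, while only the bins that matter for the return keep non-zero weights.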

Citation (APA)

Sokolovska, N. (2012). Sparse gradient-based direct policy search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7666 LNCS, pp. 212–221). https://doi.org/10.1007/978-3-642-34478-7_27
