Sparse gradient-based direct policy search

Nataliya Sokolovska

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7666 LNCS(PART 4) 212-221

DOI: 10.1007/978-3-642-34478-7_27

Abstract

Reinforcement learning is challenging when the state and action spaces are continuous. Discretizing these spaces and adapting the discretization in real time are critical issues in reinforcement learning problems. In this contribution we consider adaptive discretization and introduce a sparse gradient-based direct policy search method. We address the efficient selection of states and actions in gradient-based direct policy search by imposing sparsity through an L1 penalty term. We propose to start learning with a fine discretization of the state space and to induce sparsity via the L1 norm. We compare the proposed approach to state-of-the-art methods, such as progressive widening Q-learning, which updates the discretization of the states adaptively, and to classic as well as sparse Q-learning with linear function approximation. Our experiments on standard reinforcement learning challenges demonstrate that the proposed approach is efficient. © 2012 Springer-Verlag.
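
The idea of inducing sparsity with an L1 penalty in a gradient-based policy search can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a tabular softmax policy over an already discretized state space, a plain REINFORCE gradient estimate, and a proximal (soft-thresholding) step to handle the L1 term. The function names and the parameters `lr`, `lam`, and `gamma` are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(theta, tau):
    """Proximal operator of tau * ||theta||_1: shrink every entry toward zero."""
    return np.sign(theta) * np.maximum(np.abs(theta) - tau, 0.0)

def softmax_policy(theta, state):
    """Action probabilities for one discretized state.

    theta has shape (n_states, n_actions); this layout is an assumption.
    """
    logits = theta[state] - theta[state].max()  # subtract max for stability
    p = np.exp(logits)
    return p / p.sum()

def sparse_reinforce_step(theta, episode, lr=0.1, lam=0.01, gamma=0.99):
    """One REINFORCE ascent step followed by an L1 proximal step.

    episode is a list of (state, action, reward) tuples.
    """
    grad = np.zeros_like(theta)
    ret = 0.0
    # Walk the episode backwards to accumulate discounted returns.
    for state, action, reward in reversed(episode):
        ret = reward + gamma * ret
        p = softmax_policy(theta, state)
        g = -p
        g[action] += 1.0  # gradient of log pi(action | state) w.r.t. theta[state]
        grad[state] += ret * g
    # Gradient ascent on the return, then shrinkage for the L1 penalty.
    return soft_threshold(theta + lr * grad, lr * lam)
```

In this sketch, parameters of states that are never visited stay at zero and entries with small magnitude are shrunk to exactly zero, which is how a fine initial discretization could be pruned to a sparser one.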

Cite

Sokolovska, N. (2012). Sparse gradient-based direct policy search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7666 LNCS, pp. 212–221). https://doi.org/10.1007/978-3-642-34478-7_27
