Combining reinforcement learning with symbolic planning

Abstract

One of the major difficulties in applying Q-learning to real-world domains is the sharp increase in the number of learning steps required to converge to an optimal policy as the state space grows. In this paper we propose a method, PLANQ-learning, that couples a Q-learner with a STRIPS planner. The planner shapes the reward function and thus guides the Q-learner quickly to the optimal policy. We demonstrate empirically that this combination of high-level reasoning and low-level learning scales significantly better as the state space grows than both standard Q-learning and hierarchical Q-learning methods.
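
The abstract describes plan-based reward shaping only at a high level; the following is a minimal, self-contained Python sketch of the general idea, not the authors' actual algorithm. The grid world, the fixed two-step plan, the subgoal bonus of 0.5, and names such as shaped_reward and planq_episode are all illustrative assumptions; in the paper the STRIPS planner derives the subgoal sequence, and the exact shaping scheme may differ.

```python
import random
from collections import defaultdict

# Hypothetical high-level plan, standing in for STRIPS planner output: an
# ordered list of subgoal tests the agent should satisfy in sequence.
PLAN = [lambda s: s[0] >= 2,    # subgoal 1: reach row 2
        lambda s: s == (2, 3)]  # subgoal 2: reach the goal cell

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GOAL = (2, 3)

def step(state, action):
    """Deterministic 4x4 grid-world transition (an illustrative stand-in)."""
    r = min(max(state[0] + action[0], 0), 3)
    c = min(max(state[1] + action[1], 0), 3)
    return (r, c)

def shaped_reward(state, plan_index):
    """Plan-based shaping: add a bonus when the next subgoal is satisfied.

    This stands in for the planner "shaping the reward function"; the bonus
    value and scheme here are assumptions, not the paper's formulation.
    """
    base = 1.0 if state == GOAL else -0.01          # sparse environment reward
    if plan_index < len(PLAN) and PLAN[plan_index](state):
        return base + 0.5, plan_index + 1           # subgoal bonus, advance plan
    return base, plan_index

def planq_episode(Q, alpha=0.1, gamma=0.95, eps=0.1, max_steps=100):
    state, plan_index = (0, 0), 0
    for _ in range(max_steps):
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q[(state, i)])
        nxt = step(state, ACTIONS[a])
        r, plan_index = shaped_reward(nxt, plan_index)
        # standard Q-learning update, applied to the shaped reward
        best_next = max(Q[(nxt, i)] for i in range(len(ACTIONS)))
        Q[(state, a)] += alpha * (r + gamma * best_next - Q[(state, a)])
        state = nxt
        if state == GOAL:
            break

Q = defaultdict(float)
for _ in range(500):
    planq_episode(Q)
```

The point the sketch illustrates is that the environment's sparse reward is augmented with intermediate bonuses derived from the plan, so the Q-learner receives informative feedback long before it first reaches the goal; this is what lets the combined method scale better than unshaped Q-learning on large state spaces.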

Citation (APA)

Grounds, M., & Kudenko, D. (2008). Combining reinforcement learning with symbolic planning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4865 LNAI, pp. 75–86). https://doi.org/10.1007/978-3-540-77949-0_6
