Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes

Abstract

We present a novel approach to hierarchical reinforcement learning for linearly-solvable Markov decision processes. Our approach assumes that the state space is partitioned, and defines subtasks for moving between the partitions. We represent value functions on several levels of abstraction, and use the compositionality of subtasks to estimate the optimal values of the states in each partition. The policy is implicitly defined on these optimal value estimates, rather than being decomposed among the subtasks. As a consequence, our approach can learn the globally optimal policy, and does not suffer from non-stationarities induced by high-level decisions. If several partitions have equivalent dynamics, the subtasks of those partitions can be shared. We show that our approach is significantly more sample efficient than a flat learner and than similar hierarchical approaches when the set of boundary states is smaller than the entire state space.
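To illustrate the compositionality property the abstract relies on, the sketch below (our own illustration, not the authors' code) shows how, in a linearly-solvable MDP, the desirability function z(s) = exp(-v(s)) of a subtask is linear in its boundary values: the interior values of a partition for any boundary desirabilities are a linear combination of a fixed set of base subtask solutions, one per boundary state. All names (solve_subtask, subtask_basis, P_II, P_IB, q_I) are hypothetical and the setup is simplified.

import numpy as np

def solve_subtask(P_II, P_IB, q_I, z_B):
    """Interior desirabilities z_I for fixed boundary desirabilities z_B.

    In an LMDP, z_I satisfies z_I = G (P_II z_I + P_IB z_B), where
    G = diag(exp(-q_I)) holds the interior state costs and P is the
    passive transition matrix restricted to the partition.
    """
    n = P_II.shape[0]
    G = np.diag(np.exp(-q_I))
    # (I - G P_II) z_I = G P_IB z_B
    return np.linalg.solve(np.eye(n) - G @ P_II, G @ P_IB @ z_B)

def subtask_basis(P_II, P_IB, q_I):
    """One base subtask per boundary state: boundary value = indicator vector."""
    k = P_IB.shape[1]
    return np.column_stack(
        [solve_subtask(P_II, P_IB, q_I, np.eye(k)[:, j]) for j in range(k)]
    )

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_int, n_bnd = 5, 3
    P = rng.random((n_int, n_int + n_bnd))
    P /= P.sum(axis=1, keepdims=True)          # passive dynamics of the partition
    P_II, P_IB = P[:, :n_int], P[:, n_int:]
    q_I = rng.random(n_int)                    # interior state costs

    Z = subtask_basis(P_II, P_IB, q_I)         # solved once per partition
    z_B = rng.random(n_bnd)                    # arbitrary boundary desirabilities
    direct = solve_subtask(P_II, P_IB, q_I, z_B)
    composed = Z @ z_B                         # compositionality: reuse the basis
    print(np.allclose(direct, composed))       # True

Because the interior solution is linear in the boundary values, the base subtask solutions can be reused whenever the high-level estimates of the boundary states change, without re-solving the subtask; this is the mechanism that lets the hierarchical scheme recover globally optimal values.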

Citation (APA)

Infante, G., Jonsson, A., & Gómez, V. (2022). Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 (Vol. 36, pp. 6970–6977). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v36i6.20655
