Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes

Abstract

We present a novel approach to hierarchical reinforcement learning for linearly-solvable Markov decision processes. Our approach assumes that the state space is partitioned, and defines subtasks for moving between the partitions. We represent value functions on several levels of abstraction, and use the compositionality of subtasks to estimate the optimal values of the states in each partition. The policy is implicitly defined on these optimal value estimates, rather than being decomposed among the subtasks. As a consequence, our approach can learn the globally optimal policy, and does not suffer from non-stationarities induced by high-level decisions. If several partitions have equivalent dynamics, the subtasks of those partitions can be shared. We show that our approach is significantly more sample efficient than a flat learner and than similar hierarchical approaches when the set of boundary states is smaller than the entire state space.
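To illustrate the compositionality property the abstract relies on, the sketch below (our own illustration, not the authors' code) shows how, in a linearly-solvable MDP, the desirability function z(s) = exp(-v(s)) of a subtask is linear in its boundary values: the interior values of a partition for any boundary desirabilities are a linear combination of a fixed set of base subtask solutions, one per boundary state. All names (solve_subtask, subtask_basis, P_II, P_IB, q_I) are hypothetical and the setup is simplified.

import numpy as np

def solve_subtask(P_II, P_IB, q_I, z_B):
    """Interior desirabilities z_I for fixed boundary desirabilities z_B.

    In an LMDP, z_I satisfies z_I = G (P_II z_I + P_IB z_B), where
    G = diag(exp(-q_I)) holds the interior state costs and P is the
    passive transition matrix restricted to the partition.
    """
    n = P_II.shape[0]
    G = np.diag(np.exp(-q_I))
    # (I - G P_II) z_I = G P_IB z_B
    return np.linalg.solve(np.eye(n) - G @ P_II, G @ P_IB @ z_B)

def subtask_basis(P_II, P_IB, q_I):
    """One base subtask per boundary state: boundary value = indicator vector."""
    k = P_IB.shape[1]
    return np.column_stack(
        [solve_subtask(P_II, P_IB, q_I, np.eye(k)[:, j]) for j in range(k)]
    )

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_int, n_bnd = 5, 3
    P = rng.random((n_int, n_int + n_bnd))
    P /= P.sum(axis=1, keepdims=True)          # passive dynamics of the partition
    P_II, P_IB = P[:, :n_int], P[:, n_int:]
    q_I = rng.random(n_int)                    # interior state costs

    Z = subtask_basis(P_II, P_IB, q_I)         # solved once per partition
    z_B = rng.random(n_bnd)                    # arbitrary boundary desirabilities
    direct = solve_subtask(P_II, P_IB, q_I, z_B)
    composed = Z @ z_B                         # compositionality: reuse the basis
    print(np.allclose(direct, composed))       # True

Because the interior solution is linear in the boundary values, the base subtask solutions can be reused whenever the high-level estimates of the boundary states change, without re-solving the subtask; this is the mechanism that lets the hierarchical scheme recover globally optimal values.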

Citation (APA)

Infante, G., Jonsson, A., & Gómez, V. (2022). Globally Optimal Hierarchical Reinforcement Learning for Linearly-Solvable Markov Decision Processes. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 (Vol. 36, pp. 6970–6977). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v36i6.20655
