MAGIC: Learning Macro-Actions for Online POMDP Planning

Abstract

The partially observable Markov decision process (POMDP) is a principled general framework for robot decision making under uncertainty, but POMDP planning suffers from high computational complexity when long-term planning is required. While temporally extended macro-actions help to cut down the effective planning horizon and significantly improve computational efficiency, how do we acquire good macro-actions? This paper proposes Macro-Action Generator-Critic (MAGIC), which performs offline learning of macro-actions optimized for online POMDP planning. Specifically, MAGIC learns a macro-action generator end-to-end, using an online planner’s performance as the feedback. During online planning, the generator generates situation-aware macro-actions on the fly, conditioned on the robot’s belief and the environment context. We evaluated MAGIC on several long-horizon planning tasks, both in simulation and on a real robot. The experimental results show that the learned macro-actions offer significant benefits in online planning performance, compared with primitive actions and handcrafted macro-actions.
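To make the generator-critic idea in the abstract concrete, below is a minimal PyTorch sketch of one plausible training loop, assuming a DDPG-style scheme: the critic regresses toward the planner's observed value for a proposed set of macro-action parameters, and the generator is updated to maximize the critic's value estimate. All names, dimensions, network shapes, and the run_planner stand-in are hypothetical illustrations for this sketch, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: belief/context feature dimension and
# macro-action parameter dimension are placeholders.
BELIEF_DIM, PARAM_DIM = 64, 16

class Generator(nn.Module):
    """Maps a belief/context feature vector to macro-action parameters."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(BELIEF_DIM, 128), nn.ReLU(),
            nn.Linear(128, PARAM_DIM), nn.Tanh())  # bounded parameters
    def forward(self, belief):
        return self.net(belief)

class Critic(nn.Module):
    """Predicts the online planner's value for (belief, macro-action params)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(BELIEF_DIM + PARAM_DIM, 128), nn.ReLU(),
            nn.Linear(128, 1))
    def forward(self, belief, params):
        return self.net(torch.cat([belief, params], dim=-1))

def run_planner(belief, params):
    """Stand-in for an online POMDP planner that plans with the generated
    macro-actions and returns the value it achieves. A real system would
    invoke the planner here; this toy version just scores the parameters."""
    return -((params - belief[..., :PARAM_DIM]) ** 2).sum(-1, keepdim=True)

gen, critic = Generator(), Critic()
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
c_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

for step in range(1000):
    belief = torch.randn(32, BELIEF_DIM)       # sampled beliefs/contexts
    with torch.no_grad():
        params = gen(belief)                   # propose macro-actions
        value = run_planner(belief, params)    # planner performance feedback

    # Critic update: regress toward the planner's observed value.
    c_loss = ((critic(belief, params) - value) ** 2).mean()
    c_opt.zero_grad(); c_loss.backward(); c_opt.step()

    # Generator update: ascend the critic's value estimate end-to-end.
    g_loss = -critic(belief, gen(belief)).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

At deployment time, only the trained generator is needed: given the current belief and context, it outputs macro-action parameters on the fly, and the online planner searches over the resulting macro-actions rather than primitive actions.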

Cite (APA)

Lee, Y., Cai, P., & Hsu, D. (2021). MAGIC: Learning Macro-Actions for Online POMDP Planning. In Robotics: Science and Systems. Massachusetts Institute of Technology. https://doi.org/10.15607/RSS.2021.XVII.041
