MAGIC: Learning Macro-Actions for Online POMDP Planning

Abstract

The partially observable Markov decision process (POMDP) is a principled general framework for robot decision making under uncertainty, but POMDP planning suffers from high computational complexity when long-term planning is required. While temporally extended macro-actions help to cut down the effective planning horizon and significantly improve computational efficiency, how do we acquire good macro-actions? This paper proposes Macro-Action Generator-Critic (MAGIC), which performs offline learning of macro-actions optimized for online POMDP planning. Specifically, MAGIC learns a macro-action generator end-to-end, using an online planner’s performance as the feedback. During online planning, the generator generates situation-aware macro-actions on the fly, conditioned on the robot’s belief and the environment context. We evaluated MAGIC on several long-horizon planning tasks, both in simulation and on a real robot. The experimental results show that the learned macro-actions offer significant benefits in online planning performance, compared with primitive actions and handcrafted macro-actions.
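To make the generator-critic idea in the abstract concrete, below is a minimal PyTorch sketch of one plausible training loop, assuming a DDPG-style scheme: the critic regresses toward the planner's observed value for a proposed set of macro-action parameters, and the generator is updated to maximize the critic's value estimate. All names, dimensions, network shapes, and the run_planner stand-in are hypothetical illustrations for this sketch, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: belief/context feature dimension and
# macro-action parameter dimension are placeholders.
BELIEF_DIM, PARAM_DIM = 64, 16

class Generator(nn.Module):
    """Maps a belief/context feature vector to macro-action parameters."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(BELIEF_DIM, 128), nn.ReLU(),
            nn.Linear(128, PARAM_DIM), nn.Tanh())  # bounded parameters
    def forward(self, belief):
        return self.net(belief)

class Critic(nn.Module):
    """Predicts the online planner's value for (belief, macro-action params)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(BELIEF_DIM + PARAM_DIM, 128), nn.ReLU(),
            nn.Linear(128, 1))
    def forward(self, belief, params):
        return self.net(torch.cat([belief, params], dim=-1))

def run_planner(belief, params):
    """Stand-in for an online POMDP planner that plans with the generated
    macro-actions and returns the value it achieves. A real system would
    invoke the planner here; this toy version just scores the parameters."""
    return -((params - belief[..., :PARAM_DIM]) ** 2).sum(-1, keepdim=True)

gen, critic = Generator(), Critic()
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
c_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

for step in range(1000):
    belief = torch.randn(32, BELIEF_DIM)       # sampled beliefs/contexts
    with torch.no_grad():
        params = gen(belief)                   # propose macro-actions
        value = run_planner(belief, params)    # planner performance feedback

    # Critic update: regress toward the planner's observed value.
    c_loss = ((critic(belief, params) - value) ** 2).mean()
    c_opt.zero_grad(); c_loss.backward(); c_opt.step()

    # Generator update: ascend the critic's value estimate end-to-end.
    g_loss = -critic(belief, gen(belief)).mean()
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

At deployment time, only the trained generator is needed: given the current belief and context, it outputs macro-action parameters on the fly, and the online planner searches over the resulting macro-actions rather than primitive actions.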

Cite (APA)

Lee, Y., Cai, P., & Hsu, D. (2021). MAGIC: Learning Macro-Actions for Online POMDP Planning. In Robotics: Science and Systems. Massachusetts Institute of Technology. https://doi.org/10.15607/RSS.2021.XVII.041
