Heuristic planning plays a central role in classical planning applications and competitions. Building on this success, there has been growing interest in using deep learning to create high-quality heuristics in a supervised fashion, learning from optimal solutions of previously solved planning problems. Meta-reinforcement learning is a fast-growing research area concerned with learning, from many tasks, behaviours that quickly generalize to new tasks drawn from the same distribution as the training tasks. We make a connection between meta-reinforcement learning and heuristic planning, showing that heuristic functions meta-learned from planning problems in a given domain can outperform both popular domain-independent heuristics and heuristics learned by supervised learning. Furthermore, while most supervised learning algorithms rely on ad-hoc encodings of the state representation, our method takes a general PDDL 3.1 description as input. We evaluated our heuristic with an A* planner on six domains from the International Planning Competition and the FF Domain Collection, showing that the meta-learned heuristic leads, on average, to the expansion of fewer states than three popular heuristics used by the FastDownward planner and a supervised-learned heuristic.
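As a rough illustration of the evaluation setup described above, a learned heuristic simply replaces the hand-crafted estimate inside A*. The sketch below is a minimal, generic A* loop; the `heuristic`, `successors`, and `goal_test` callables are hypothetical placeholders (in the paper's setting, `heuristic` would be a meta-learned network evaluated on a PDDL-derived state encoding), not the authors' implementation.

```python
import heapq
import itertools

def astar(initial_state, goal_test, successors, heuristic):
    """Plain A* search. `heuristic(state)` can be any estimator, e.g. a
    meta-learned network applied to a PDDL-derived state encoding
    (that interface is a hypothetical placeholder here)."""
    counter = itertools.count()  # tie-breaker so states are never compared directly
    frontier = [(heuristic(initial_state), next(counter), 0, initial_state, [])]
    best_g = {initial_state: 0}
    while frontier:
        _, _, g, state, plan = heapq.heappop(frontier)
        if goal_test(state):
            return plan  # sequence of actions reaching the goal
        for action, next_state, cost in successors(state):
            new_g = g + cost
            if new_g < best_g.get(next_state, float("inf")):
                best_g[next_state] = new_g
                heapq.heappush(frontier,
                               (new_g + heuristic(next_state), next(counter),
                                new_g, next_state, plan + [action]))
    return None  # no plan found
```

In this framing, swapping between a domain-independent heuristic, a supervised-learned one, and a meta-learned one only changes the `heuristic` argument, and the paper's comparison metric is the number of states the loop expands before finding a plan.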
CITATION STYLE
Gutierrez, R. L., & Leonetti, M. (2021). Meta-Reinforcement Learning for Heuristic Planning. In Proceedings of the International Conference on Automated Planning and Scheduling, ICAPS (Vol. 2021-August, pp. 551–559). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/icaps.v31i1.16003