Heuristic functions make MDP solvers practical by reducing their time and memory requirements. Some of the most effective heuristics (e.g., the FF heuristic function) first determinize the MDP and then solve a relaxation of the resulting classical planning problem (e.g., by ignoring delete effects). While these heuristic functions are fast to compute, they frequently yield overly optimistic value estimates. It is natural to wonder, then, whether the better estimates obtained by running a full classical planner on the (non-relaxed) determinized domain would provide enough gains to compensate for the vastly increased cost of computation. This paper shows that the answer is "No and Yes". If one uses a full classical planner in the obvious way, the cost of computing the heuristic function outweighs the benefits. However, we show that one can make the idea practical by generalizing the results of the classical planner's successes and failures. Specifically, we introduce a novel heuristic function called GOTH that amortizes the cost of classical planning by 1) extracting basis functions from the plans discovered during heuristic computation, 2) using these basis functions to generalize the heuristic value of one state to cover many others, and 3) thus invoking the classical planner far fewer times than there are states. Experiments show that GOTH can provide vast time and memory savings compared to the FF heuristic function, especially on large problems.

Copyright © 2010, Association for the Advancement of Artificial Intelligence. All rights reserved.
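
The amortization idea in the abstract (extract basis functions from plans, reuse them across states, and call the planner only when nothing applies) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that a state is a set of true literals, that a basis function is a conjunction of literals paired with a cost estimate, and that the value of a state is the minimum cost among the basis functions that hold in it; the abstract does not specify any of these details. The names `GeneralizingHeuristic`, `plan_fn`, and `toy_plan_fn` are hypothetical, and `plan_fn` stands in for a real classical planner run on the determinized problem.

```python
from typing import Callable, FrozenSet, List, Optional, Tuple

State = FrozenSet[str]                 # assumption: a state is the set of literals true in it
Basis = Tuple[FrozenSet[str], float]   # assumption: (conjunction of literals, cost estimate)


class GeneralizingHeuristic:
    """Caches basis functions so that one planner call can cover many states."""

    def __init__(self,
                 plan_fn: Callable[[State], Optional[List[Basis]]],
                 dead_end_value: float = float("inf")) -> None:
        # plan_fn is a hypothetical stand-in for "run a classical planner on the
        # determinized problem and extract basis functions from the plan found".
        # It returns None (or an empty list) when no plan exists.
        self.plan_fn = plan_fn
        self.dead_end_value = dead_end_value
        self.bases: List[Basis] = []
        self.planner_calls = 0

    def __call__(self, state: State) -> float:
        # A basis function applies when all of its literals hold in the state.
        costs = [cost for literals, cost in self.bases if literals <= state]
        if costs:
            # Assumption: combine applicable basis functions by taking the minimum.
            return min(costs)
        # No cached basis function covers this state, so pay for a planner call.
        self.planner_calls += 1
        found = self.plan_fn(state)
        if not found:
            return self.dead_end_value
        self.bases.extend(found)
        costs = [cost for literals, cost in found if literals <= state]
        return min(costs) if costs else self.dead_end_value


def toy_plan_fn(state: State) -> Optional[List[Basis]]:
    """Toy planner stub: a plan exists only from states that contain 'have_key'."""
    if "have_key" in state:
        return [(frozenset({"have_key"}), 2.0), (frozenset({"door_open"}), 1.0)]
    return None
```

In this toy setting, evaluating a second state that also contains `have_key` reuses the cached basis functions and leaves `planner_calls` unchanged, which is the saving the abstract attributes to generalization.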
Kolobov, A., Mausam, & Weld, D. S. (2010). Classical planning in MDP heuristics: With a little help from generalization. In ICAPS 2010 - Proceedings of the 20th International Conference on Automated Planning and Scheduling (pp. 97–104). https://doi.org/10.1609/icaps.v20i1.13424