Heuristics for planning with penalties and rewards formulated in logic and computed through circuits

Abstract

The automatic derivation of heuristic functions for guiding the search for plans is a fundamental technique in planning. The heuristics considered so far, however, deal only with simple planning models where costs are associated with actions but not with states. In this work we address this limitation by formulating a more expressive planning model and a corresponding heuristic where preferences in the form of penalties and rewards are associated with fluents as well. The heuristic, which generalizes the well-known delete-relaxation heuristic, is admissible and informative but intractable. Exploiting a correspondence between heuristics and preferred models, and a property of formulas compiled in d-DNNF, we show however that if a suitable relaxation of the domain, expressed as the strong completion of a logic program with no time indices or horizon, is compiled into d-DNNF, the heuristic can be computed for any search state in time that is linear in the size of the compiled representation. This representation defines an evaluation network or circuit that maps states into heuristic values in linear time. While this circuit may have exponential size in the worst case, as for OBDDs, this is not necessarily so. We report empirical results, discuss the application of the framework in settings where there are no goals but just preferences, and illustrate the versatility of the account by developing a new heuristic that overcomes limitations of delete-based relaxations through the use of valid but implicit plan constraints. In particular, for the Traveling Salesman Problem, the new heuristic captures the exact cost while the delete-relaxation heuristic, which is also exponential in the worst case, captures only the Minimum Spanning Tree lower bound. © 2008 Elsevier B.V. All rights reserved.
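The linear-time claim in the abstract rests on a general property of d-DNNF circuits: the cost of a best model can be obtained in one bottom-up pass, taking sums at AND nodes and minima at OR nodes. The following is a minimal sketch of that idea only, not the authors' implementation. The node classes, the `min_cost` function, the cost table, and the toy formula are all hypothetical, and the circuit is assumed to be smooth so that negative costs (rewards) are handled correctly.

```python
import math
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Lit:
    var: str
    positive: bool


@dataclass(frozen=True)
class And:
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:
    children: Tuple["Node", ...]


Node = Union[Lit, And, Or]


def min_cost(root: Node,
             cost: Dict[Tuple[str, bool], float],
             evidence: Dict[str, bool]) -> float:
    """Cost of a cheapest model of the circuit consistent with `evidence`.

    Assumes `root` is a smooth d-DNNF circuit (an assumption of this sketch).
    Each node is evaluated once, so the pass is linear in the circuit size.
    """
    cache: Dict[int, float] = {}

    def visit(node: Node) -> float:
        key = id(node)  # memoize shared sub-circuits by identity
        if key in cache:
            return cache[key]
        if isinstance(node, Lit):
            if evidence.get(node.var, node.positive) != node.positive:
                value = math.inf  # literal contradicts the given state
            else:
                value = cost.get((node.var, node.positive), 0.0)
        elif isinstance(node, And):
            # Decomposability: children share no variables, so costs add up.
            value = sum(visit(child) for child in node.children)
        else:
            # OR node: pick the cheapest alternative.
            value = min(visit(child) for child in node.children)
        cache[key] = value
        return value

    return visit(root)


# Toy formula (a AND b) OR (NOT a AND b), with a penalty on a and a reward on b.
b = Lit("b", True)
circuit = Or((And((Lit("a", True), b)), And((Lit("a", False), b))))
costs = {("a", True): 3, ("b", True): -1}

best_unconstrained = min_cost(circuit, costs, {})
best_given_a = min_cost(circuit, costs, {"a": True})
best_given_not_b = min_cost(circuit, costs, {"b": False})
```

In this sketch the `evidence` argument plays the role of the search state: literals that contradict it receive infinite cost, so the same compiled circuit can be re-evaluated for each state without recompilation.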

Cite

Bonet, B., & Geffner, H. (2008). Heuristics for planning with penalties and rewards formulated in logic and computed through circuits. Artificial Intelligence, 172(12–13), 1579–1604. https://doi.org/10.1016/j.artint.2008.03.004
