Focused topological value iteration

12Citations
Citations of this article
10Readers
Mendeley users who have this article in their library.

Abstract

Topological value iteration (TVI) is an effective algorithm for solving Markov decision processes (MDPs) optimally, which 1) divides an MDP into strongly-connected components, and 2) solves these components sequentially. Yet, TVI's usefulness tends to degrade if an MDP has large components, because the cost of the division process isn't offset by gains during solution. This paper presents a new algorithm to solve MDPs optimally, focused topological value iteration (FTVI). FTVI addresses TVI's limitations by restricting its attention to connected components that are relevant for solving the MDP. Specifically, FTVI uses a small amount of heuristic search to eliminate provably sub-optimal actions; this pruning allows FTVI to find smaller connected components, thus running faster. We demonstrate that our new algorithm outperforms TVI by an order of magnitude, averaged across several domains. Surprisingly, FTVI also significantly outperforms popular 'heuristically-informed' MDP algorithms such as LAO*, LRTDP, and BRTDP in many domains, sometimes by as much as two orders of magnitude. Finally, we characterize the type of domains where FTVI excels -suggesting a way to an informed choice of solver. Copyright © 2009, Association for the Advancement of Artificial Intelligence. All rights reserved.

Cite

CITATION STYLE

APA

Dai, P., Mausam, & Weld, D. S. (2009). Focused topological value iteration. In ICAPS 2009 - Proceedings of the 19th International Conference on Automated Planning and Scheduling (pp. 82–89). https://doi.org/10.1609/icaps.v19i1.18138

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free