A stochastic point-based algorithm for POMDPs

Abstract

We introduce a new backup operator for point-based POMDP algorithms that performs a look-ahead search at depth greater than one. We use this operator in a new algorithm called Stochastic Search Value Iteration (SSVI). SSVI relies on stochastic exploration of the environment to update the value function, in contrast to existing point-based POMDP algorithms. The ideas underlying SSVI are very similar to those of temporal difference learning algorithms for MDPs. In particular, SSVI takes advantage of a soft-max action selection function and of the randomness of the environment itself. Empirical results on standard benchmark problems show that our algorithm performs slightly better and slightly faster than HSVI2, the state-of-the-art algorithm. This suggests that stochastic algorithms are a viable alternative for solving large POMDPs. © 2008 Springer-Verlag Berlin Heidelberg.
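The abstract does not give the exact form of the soft-max action selection function, so the following is only a minimal sketch of the generic Boltzmann (soft-max) rule familiar from temporal difference learning. The names `softmax_probabilities`, `select_action`, `action_values`, and `temperature` are illustrative assumptions, not taken from the paper, and the action values are assumed to be estimates already computed at the current belief.

```python
import math
import random


def softmax_probabilities(action_values, temperature=1.0):
    """Turn estimated action values into a probability distribution.

    Assumption: this is the textbook Boltzmann rule, not necessarily the
    exact function used by SSVI. A lower temperature is greedier; a higher
    temperature is closer to uniform.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    best = max(action_values)
    weights = [math.exp((q - best) / temperature) for q in action_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(action_values, temperature=1.0, rng=random):
    """Sample an action index according to the soft-max probabilities."""
    probs = softmax_probabilities(action_values, temperature)
    return rng.choices(range(len(action_values)), weights=probs, k=1)[0]
```

For example, `select_action([1.0, 2.0, 3.0], temperature=0.5)` returns index 2 most often but still samples the other actions occasionally, which is the kind of stochastic exploration the abstract contrasts with existing point-based algorithms.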

Cite

Laviolette, F., & Tobin, L. (2008). A stochastic point-based algorithm for POMDPs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5032 LNAI, pp. 332–343). https://doi.org/10.1007/978-3-540-68825-9_31