Abstract
State-of-the-art sampling-based online POMDP solvers compute near-optimal policies for POMDPs with very large state spaces. However, when faced with large observation spaces, they may become overly optimistic and compute suboptimal policies because of particle divergence. This paper presents a new online POMDP solver, DESPOT-α, which builds on the widely used DESPOT solver. DESPOT-α improves the practical performance of online planning for POMDPs with large observation spaces as well as large state spaces. Like DESPOT, DESPOT-α uses the particle belief approximation and searches a determinized sparse belief tree. To tackle large observation spaces, DESPOT-α shares sub-policies among many observations during online policy computation. The value function of a sub-policy is a linear function of the belief, commonly known as an α-vector. We introduce a particle approximation of the α-vector to improve the efficiency of online policy search. We further speed up DESPOT-α using the CPU and GPU parallelization ideas introduced in HyP-DESPOT. Experimental results show that DESPOT-α/HyP-DESPOT-α outperform DESPOT/HyP-DESPOT on POMDPs with large observation spaces, including a complex simulation task involving an autonomous vehicle driving among many pedestrians.
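For a belief b represented by a weighted particle set {(s_i, w_i)}, the α-vector value V_π(b) = Σ_s b(s) α_π(s) can be approximated by a weighted sum over the particles, and the best sub-policy is the one whose approximated value is largest. The sketch below illustrates this particle approximation under those assumptions; the function names and the per-state α representation are hypothetical stand-ins, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): particle approximation of
# alpha-vector values, assuming weighted particles and per-state alpha values.

def policy_value(particles, weights, alpha):
    """Approximate V_pi(b) = sum_s b(s) * alpha_pi(s) over weighted particles.

    particles: list of sampled states representing the belief
    weights:   importance weights of the particles (need not be normalized)
    alpha:     callable mapping a state s to the sub-policy's value alpha_pi(s)
    """
    total = sum(weights)
    return sum(w * alpha(s) for s, w in zip(particles, weights)) / total

def best_sub_policy(particles, weights, alphas):
    """Return the index of the sub-policy with the highest approximated value."""
    return max(range(len(alphas)),
               key=lambda i: policy_value(particles, weights, alphas[i]))

# Example: two particles and two candidate sub-policies as state -> value maps.
particles = ["s1", "s2"]
weights = [0.7, 0.3]
alpha1 = {"s1": 5.0, "s2": 1.0}.get
alpha2 = {"s1": 2.0, "s2": 4.0}.get
print(policy_value(particles, weights, alpha1))               # 3.8
print(best_sub_policy(particles, weights, [alpha1, alpha2]))  # 0
```

Because the same α-vector can score any particle belief, one sub-policy evaluated this way can be shared across the many beliefs reached under different observations, which is the source of the efficiency gain the abstract describes.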
Citation
Garg, N. P., Hsu, D., & Lee, W. S. (2019). DESPOT-α: Online POMDP Planning With Large State And Observation Spaces. In Robotics: Science and Systems. MIT Press Journals. https://doi.org/10.15607/RSS.2019.XV.006