Memory bounded open-loop planning in large POMDPs using Thompson sampling

Abstract

State-of-the-art approaches to partially observable planning, such as POMCP, are based on stochastic tree search. While these approaches are computationally efficient, they may still construct search trees of considerable size, which could limit performance when memory resources are restricted. In this paper, we propose Partially Observable Stacked Thompson Sampling (POSTS), a memory-bounded approach to open-loop planning in large POMDPs, which optimizes a fixed-size stack of Thompson Sampling bandits. We empirically evaluate POSTS in four large benchmark problems and compare its performance with different tree-based approaches. We show that POSTS achieves competitive performance compared to tree-based open-loop planning and offers a performance-memory tradeoff, making it suitable for partially observable planning with highly restricted computational and memory resources.
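
The idea of optimizing a fixed-size stack of Thompson Sampling bandits can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the posterior or the update rule, so the sketch assumes a simplified Gaussian posterior per action (variance 1 / (count + 1)) and a discounted-return update. The names GaussianThompsonBandit, plan_open_loop, step and sample_state are hypothetical; step stands for a generative model and sample_state for a draw from the current belief.

```python
import math
import random


class GaussianThompsonBandit:
    """One bandit of the stack; keeps a count and a mean return per action.

    Assumption: a simplified Gaussian posterior N(mean, 1 / (count + 1)) per
    action, chosen for brevity. The paper may use a different posterior.
    """

    def __init__(self, num_actions, rng):
        self.counts = [0] * num_actions
        self.means = [0.0] * num_actions
        self.rng = rng

    def sample_action(self):
        # Thompson Sampling: draw one value per action, pick the largest draw.
        draws = [
            self.rng.gauss(mean, 1.0 / math.sqrt(count + 1))
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, action, ret):
        # Incremental mean of the returns observed for this action.
        self.counts[action] += 1
        self.means[action] += (ret - self.means[action]) / self.counts[action]


def plan_open_loop(step, sample_state, num_actions, horizon, budget,
                   gamma=0.95, seed=0):
    """Return a first action using a stack of `horizon` bandits.

    Hypothetical interfaces (assumptions, not from the paper):
      step(state, action, rng) -> (next_state, reward, done)
      sample_state(rng)        -> a state drawn from the current belief
    Memory use is O(horizon * num_actions), independent of `budget`.
    """
    rng = random.Random(seed)
    stack = [GaussianThompsonBandit(num_actions, rng) for _ in range(horizon)]

    for _ in range(budget):
        state = sample_state(rng)
        actions, rewards = [], []
        # Open-loop rollout: the bandit at depth t picks the action at step t,
        # without conditioning on observations.
        for bandit in stack:
            action = bandit.sample_action()
            state, reward, done = step(state, action, rng)
            actions.append(action)
            rewards.append(reward)
            if done:
                break
        # Backward pass: update each visited bandit with the discounted
        # return accumulated from its depth onward.
        ret = 0.0
        for t in reversed(range(len(actions))):
            ret = rewards[t] + gamma * ret
            stack[t].update(actions[t], ret)

    # Recommend the action with the highest mean return in the first bandit.
    return max(range(num_actions), key=stack[0].means.__getitem__)
```

Under these assumptions, the planner is called once per decision step with a generative model of the problem, and only the first action of the plan is executed. Because the stack has a fixed number of bandits, a larger simulation budget refines the estimates without allocating more memory, which is the contrast with tree-based search that the abstract draws.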

Cite

Phan, T., Friedrich, M., Belzner, L., Schmid, K., Kiermeier, M., & Linnhoff-Popien, C. (2019). Memory bounded open-loop planning in large POMDPs using Thompson sampling. In 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (pp. 7941–7948). AAAI Press. https://doi.org/10.1609/aaai.v33i01.33017941
