Bayesian Optimized Monte Carlo Planning

John Mern; Anil Yildiz; Zachary Sunberg; Tapan Mukerji; Mykel J. Kochenderfer

Conference Proceedings (Open Access)

35th AAAI Conference on Artificial Intelligence, AAAI 2021 (2021) 13B 11880-11887

DOI: 10.1609/aaai.v35i13.17411

21 Citations

50 Readers

Abstract

Online solvers for partially observable Markov decision processes have difficulty scaling to problems with large action spaces. Monte Carlo tree search with progressive widening attempts to improve scaling by sampling from the action space to construct a policy search tree. The performance of progressive widening search is dependent upon the action sampling policy, often requiring problem-specific samplers. In this work, we present a general method for efficient action sampling based on Bayesian optimization. The proposed method uses a Gaussian process to model a belief over the action-value function and selects the action that will maximize the expected improvement in the optimal action value. We implement the proposed approach in a new online tree search algorithm called Bayesian Optimized Monte Carlo Planning (BOMCP). Several experiments show that BOMCP is better able to scale to large action space POMDPs than existing state-of-the-art tree search solvers.
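The action-selection idea in the abstract (fit a Gaussian process to the action values seen so far, then pick the candidate with the highest expected improvement) can be illustrated with a minimal sketch. This is not the BOMCP implementation. It assumes a one-dimensional continuous action, a zero-mean prior, a squared-exponential kernel with hand-picked hyperparameters, and a finite list of candidate actions; all function names and parameter values below are hypothetical.

```python
import math

def rbf(a, b, length_scale=0.3, signal_var=1.0):
    """Squared-exponential kernel between two scalar actions (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(actions, values, query, noise_var=1e-4):
    """Posterior mean and standard deviation of the action value at `query`."""
    n = len(actions)
    K = [[rbf(actions[i], actions[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, values))  # K^{-1} values
    k_star = [rbf(a, query) for a in actions]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(query, query) - sum(x * x for x in v), 0.0)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over the best value estimate seen so far."""
    if std <= 1e-12:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf

def select_action(actions, values, candidates):
    """Return the candidate action with the highest expected improvement."""
    best = max(values)
    def ei(c):
        mean, std = gp_posterior(actions, values, c)
        return expected_improvement(mean, std, best)
    return max(candidates, key=ei)
```

In a progressive-widening search, a call such as `select_action(tried_actions, value_estimates, candidate_grid)` would stand in for drawing the next action from a problem-specific sampler whenever a node is allowed to add a child.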

Citation (APA)

Mern, J., Yildiz, A., Sunberg, Z., Mukerji, T., & Kochenderfer, M. J. (2021). Bayesian Optimized Monte Carlo Planning. In 35th AAAI Conference on Artificial Intelligence, AAAI 2021 (Vol. 13B, pp. 11880–11887). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v35i13.17411
