Smarter sampling in model-based Bayesian reinforcement learning

Pablo Samuel Castro; Doina Precup

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6321 LNAI(PART 1) 200-214

DOI: 10.1007/978-3-642-15880-3_19

13 Citations

19 Readers

Abstract

Bayesian reinforcement learning (RL) aims to make more efficient use of data samples, but typically requires significantly more computation. For discrete Markov Decision Processes, a typical approach to Bayesian RL is to sample a set of models from an underlying distribution and compute a value function for each, e.g. using dynamic programming. This makes the computation cost per sampled model very high. Furthermore, the number of model samples to take at each step has mainly been chosen in an ad hoc fashion. We propose a principled method for determining the number of models to sample, based on the parameters of the posterior distribution over models. Our sampling method is local, in that we may choose a different number of samples for each state-action pair. We establish bounds on the error in the value function between a random model sample and the mean model from the posterior distribution. We compare our algorithm against state-of-the-art methods and demonstrate that our method provides a better trade-off between performance and running time. © 2010 Springer-Verlag Berlin Heidelberg.
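
The generic sample-then-solve loop that the abstract describes can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a Dirichlet posterior over next-state distributions (the abstract does not name the posterior family), uses a made-up two-state, two-action MDP, and fixes the number of sampled models at 5 rather than choosing it per state-action pair. All names (`sample_model`, `mean_model`, `value_iteration`, `counts`, `rewards`) are hypothetical.

```python
import random


def sample_dirichlet(params, rng):
    """Draw one probability vector from Dirichlet(params) via normalized Gamma draws."""
    draws = [rng.gammavariate(p, 1.0) for p in params]
    total = sum(draws)
    return [d / total for d in draws]


def sample_model(counts, rng):
    """Sample a transition model; counts[s][a] holds assumed Dirichlet parameters."""
    return [[sample_dirichlet(row, rng) for row in state] for state in counts]


def mean_model(counts):
    """Posterior-mean transition model: normalize each parameter vector."""
    return [[[c / sum(row) for c in row] for row in state] for state in counts]


def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Dynamic programming: iterate the Bellman optimality backup to convergence."""
    V = [0.0] * len(P)
    while True:
        new_V = [
            max(
                R[s][a] + gamma * sum(p * V[t] for t, p in enumerate(P[s][a]))
                for a in range(len(P[s]))
            )
            for s in range(len(P))
        ]
        if max(abs(x - y) for x, y in zip(new_V, V)) < tol:
            return new_V
        V = new_V


# Toy posterior (assumed): counts[s][a][t] is the Dirichlet parameter for s -a-> t.
counts = [
    [[8.0, 2.0], [1.0, 1.0]],
    [[3.0, 7.0], [5.0, 5.0]],
]
# Toy rewards in [0, 1], indexed as rewards[s][a] (assumed known).
rewards = [[0.0, 0.2], [1.0, 0.5]]

rng = random.Random(0)
num_samples = 5  # fixed here; the paper's point is to choose this in a principled way
sampled_models = [sample_model(counts, rng) for _ in range(num_samples)]
sampled_values = [value_iteration(P, rewards) for P in sampled_models]
v_mean = value_iteration(mean_model(counts), rewards)

# Largest gap between any sampled model's value function and the mean model's.
max_gap = max(
    abs(v - m) for V in sampled_values for v, m in zip(V, v_mean)
)
print("value function of mean model:", v_mean)
print("max gap to sampled models:", max_gap)
```

Each sampled model costs one full run of value iteration, which is why the number of samples matters for running time. The `max_gap` quantity is only an empirical stand-in for the kind of error between a random model sample and the mean model that the paper bounds analytically.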

Cite

Castro, P. S., & Precup, D. (2010). Smarter sampling in model-based Bayesian reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6321 LNAI, pp. 200–214). https://doi.org/10.1007/978-3-642-15880-3_19
