Bayesian reinforcement learning with exploration


Abstract

We consider a general reinforcement learning problem and show that carefully combining the Bayesian optimal policy and an exploring policy leads to minimax sample-complexity bounds in a very general class of (history-based) environments. We also prove lower bounds and show that the new algorithm displays adaptive behaviour when the environment is easier than worst-case.
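The paper's actual algorithm and proofs are not reproduced on this page. As a rough illustration of the idea stated in the abstract, namely follow a Bayes-optimal policy but fall back to an exploring policy while acting is still informative, here is a minimal Python sketch on a toy two-armed Bernoulli bandit. Everything in it is an assumption for illustration: the finite hypothesis class, the information-gain threshold, and the myopic Bayes-greedy choice standing in for the Bayes-optimal policy; none of this is the paper's construction for general history-based environments.

import math
import random

# Hypothetical finite hypothesis class: each environment is a pair of
# Bernoulli success probabilities for a two-armed bandit.  This is an
# illustrative stand-in for the paper's general history-based environments.
HYPOTHESES = [(0.2, 0.8), (0.8, 0.2), (0.5, 0.5)]

def entropy(posterior):
    """Shannon entropy of the posterior over hypotheses (in nats)."""
    return -sum(p * math.log(p) for p in posterior if p > 0)

def update(posterior, arm, reward):
    """Bayes rule for a Bernoulli observation on the chosen arm."""
    likelihood = [h[arm] if reward == 1 else 1 - h[arm] for h in HYPOTHESES]
    unnorm = [p * l for p, l in zip(posterior, likelihood)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def expected_info_gain(posterior, arm):
    """Expected entropy reduction from pulling `arm` (value of information)."""
    p1 = sum(p * h[arm] for p, h in zip(posterior, HYPOTHESES))
    gain = entropy(posterior)
    for reward, pr in ((1, p1), (0, 1 - p1)):
        if pr > 0:
            gain -= pr * entropy(update(posterior, arm, reward))
    return gain

def act(posterior, threshold=0.05):
    """Explore while some action is still informative; otherwise exploit.

    The exploit branch uses a myopic Bayes-greedy choice as a cheap
    stand-in (an assumption here) for a Bayes-optimal policy.
    """
    gains = [expected_info_gain(posterior, a) for a in (0, 1)]
    if max(gains) > threshold:                      # exploring policy
        return gains.index(max(gains))
    means = [sum(p * h[a] for p, h in zip(posterior, HYPOTHESES))
             for a in (0, 1)]                       # Bayes-greedy policy
    return means.index(max(means))

# Tiny simulation against a fixed true environment.
random.seed(0)
truth = HYPOTHESES[0]
posterior = [1 / len(HYPOTHESES)] * len(HYPOTHESES)
for t in range(200):
    arm = act(posterior)
    reward = 1 if random.random() < truth[arm] else 0
    posterior = update(posterior, arm, reward)
print("posterior:", [round(p, 3) for p in posterior])

Once no action's expected information gain exceeds the threshold, the sketch commits to exploitation, which loosely mirrors the abstract's claim that exploration effort adapts to how hard the environment is to identify.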

Citation (APA)

Lattimore, T., & Hutter, M. (2014). Bayesian reinforcement learning with exploration. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8776, pp. 170–184). Springer Verlag. https://doi.org/10.1007/978-3-319-11662-4_13
