Bandit based Monte-Carlo planning

Citations: 2.2k
Mendeley readers: 1.2k

This article is free to access.

Abstract

For large state-space Markovian Decision Problems Monte-Carlo planning is one of the few viable approaches to find near-optimal solutions. In this paper we introduce a new algorithm, UCT, that applies bandit ideas to guide Monte-Carlo planning. In finite-horizon or discounted MDPs the algorithm is shown to be consistent and finite sample bounds are derived on the estimation error due to sampling. Experimental results show that in several domains, UCT is significantly more efficient than its alternatives. © Springer-Verlag Berlin Heidelberg 2006.
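The bandit idea the abstract refers to is UCB1-style action selection: at each tree node, pick the action that maximizes the average sampled reward plus an exploration bonus that shrinks as the action is tried more often. The sketch below is not the paper's pseudocode — it is a minimal illustration of the UCB1 selection rule on a toy two-armed bandit, with arm payoffs and the exploration constant chosen arbitrarily for the demo.

```python
import math
import random

def ucb1_select(sums, counts, total, c=math.sqrt(2)):
    """Pick the arm maximizing mean reward plus a UCB1 exploration bonus.

    sums[i]   -- total reward observed for arm i
    counts[i] -- number of times arm i has been pulled
    total     -- total pulls so far, across all arms
    """
    # Try every arm once before applying the formula (avoids log(0)/divide-by-zero).
    for i, n in enumerate(counts):
        if n == 0:
            return i
    return max(
        range(len(sums)),
        key=lambda i: sums[i] / counts[i] + c * math.sqrt(math.log(total) / counts[i]),
    )

# Toy demo: two Bernoulli arms with success probabilities 0.4 and 0.6.
random.seed(0)
probs = [0.4, 0.6]
sums = [0.0, 0.0]
counts = [0, 0]
for t in range(1, 2001):
    arm = ucb1_select(sums, counts, t)
    reward = 1.0 if random.random() < probs[arm] else 0.0
    sums[arm] += reward
    counts[arm] += 1

# The exploration bonus decays, so pulls concentrate on the better arm.
print(counts)
```

In UCT this same rule is applied recursively: each internal node of the search tree treats its child actions as bandit arms, and Monte-Carlo rollouts supply the rewards.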



Citation (APA)

Kocsis, L., & Szepesvári, C. (2006). Bandit based Monte-Carlo planning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4212 LNAI, pp. 282–293). Springer Verlag. https://doi.org/10.1007/11871842_29


Readers' Seniority

PhD / Post grad / Masters / Doc: 629 (73%)
Researcher: 148 (17%)
Professor / Associate Prof.: 73 (8%)
Lecturer / Post doc: 13 (2%)

Readers' Discipline

Computer Science: 638 (73%)
Engineering: 176 (20%)
Mathematics: 36 (4%)
Agricultural and Biological Sciences: 23 (3%)

Article Metrics

News Mentions: 1
References: 4
