Adapting improved upper confidence bounds for Monte-Carlo tree search

Yun Ching Liu; Yoshimasa Tsuruoka

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9525 53-64

DOI: 10.1007/978-3-319-27992-3_6

1 Citation

5 Readers

Abstract

The UCT algorithm, which combines the UCB algorithm with Monte-Carlo Tree Search (MCTS), is currently the most widely used variant of MCTS. Recently, a number of investigations into applying other bandit algorithms to MCTS have produced interesting results. In this research, we investigate the possibility of combining the improved UCB algorithm, proposed by Auer et al. [2], with MCTS. However, various characteristics and properties of the improved UCB algorithm may not be ideal for direct application to MCTS. Therefore, some modifications were made to the improved UCB algorithm to make it more suitable for the task of game-tree search. The Mi-UCT algorithm is the application of this modified UCB algorithm to trees. The performance of Mi-UCT is demonstrated on the games of 9 × 9 Go and 9 × 9 NoGo: it is shown to outperform the plain UCT algorithm when only a small number of playouts are given, and to perform roughly on the same level when more playouts are available.
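For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the standard UCB1 selection rule as it is commonly used inside plain UCT. It is not the improved UCB algorithm of Auer et al., nor the Mi-UCT modification described in the paper. The function names, the `(total_reward, visits)` tuple layout, and the default exploration constant `c = sqrt(2)` are assumptions made here for illustration only.

```python
import math


def ucb1_score(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of one child: mean reward plus an exploration bonus.

    Hypothetical helper for illustration. An unvisited child gets an
    infinite score so that every child is tried at least once.
    """
    if visits == 0:
        return math.inf
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    `children` is a list of (total_reward, visits) tuples; this layout
    is an assumption of the sketch, not something taken from the paper.
    """
    parent_visits = sum(visits for _, visits in children)
    scores = [
        ucb1_score(total / visits if visits else 0.0, visits, parent_visits, c)
        for total, visits in children
    ]
    return max(range(len(children)), key=scores.__getitem__)
```

In a UCT-style search, a rule like `select_child` would be applied at each node on the way down the tree, with the rewards coming from playouts. With few playouts the exploration bonus dominates the score, which is the regime where the abstract reports the largest difference between Mi-UCT and plain UCT.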

CITATION STYLE

APA

Liu, Y. C., & Tsuruoka, Y. (2015). Adapting improved upper confidence bounds for monte-carlo tree search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9525, pp. 53–64). Springer Verlag. https://doi.org/10.1007/978-3-319-27992-3_6
