Computational experiments with the RAVE heuristic

Abstract

The Monte-Carlo tree search algorithm Upper Confidence bounds applied to Trees (UCT) has become extremely popular in computer games research. The Rapid Action Value Estimation (RAVE) heuristic is a strong estimator that often improves the performance of UCT-based algorithms. However, there are situations where RAVE misleads the search, whereas pure UCT search can find the correct solution. Two games, the simple abstract game Sum of Switches (SOS) and the game of Go, are used to study the behavior of the RAVE heuristic. In SOS, RAVE updates are manipulated to mimic game situations where RAVE misleads the search. Such false RAVE updates are used to create RAVE overestimates and underestimates. A study of the distributions of mean and RAVE values reveals great differences between Go and SOS. While the RAVE-max update rule is able to correct extreme cases of RAVE underestimation, it is not effective in settings closer to practice, nor in Go. © 2011 Springer-Verlag.
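The way a misleading RAVE value can dominate a move's estimate early in the search can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the constant `k`, and the weighting schedule `k / (k + visits)` are all hypothetical, chosen only so that the RAVE weight starts at 1 and decays as real visits accumulate. The RAVE-max variant shown is one reading of the rule named in the abstract, assumed here to substitute the larger of the mean and RAVE values for the RAVE value.

```python
def rave_weight(visits: int, k: float = 1000.0) -> float:
    """Hypothetical schedule: weight on the RAVE value, 1.0 at zero visits,
    decaying toward 0 as the move accumulates real simulations."""
    return k / (k + visits)


def blended_value(mean: float, rave: float, visits: int, k: float = 1000.0) -> float:
    """Weighted combination of the mean (Monte-Carlo) value and the RAVE value."""
    beta = rave_weight(visits, k)
    return (1.0 - beta) * mean + beta * rave


def blended_value_rave_max(mean: float, rave: float, visits: int, k: float = 1000.0) -> float:
    """Assumed RAVE-max variant: use max(mean, rave) in place of the RAVE value,
    so a RAVE underestimate cannot pull the estimate below the mean."""
    return blended_value(mean, max(mean, rave), visits, k)


if __name__ == "__main__":
    # A good move (mean 0.6) whose RAVE value is a severe underestimate (0.2).
    for n in (0, 100, 1000, 10000):
        plain = blended_value(0.6, 0.2, n)
        capped = blended_value_rave_max(0.6, 0.2, n)
        print(f"visits={n:>5}  blended={plain:.3f}  rave_max={capped:.3f}")
```

Under these assumptions the plain blend stays close to the misleading RAVE value until the move has many real visits, while the max-based variant returns the mean. The variant offers no protection against a RAVE overestimate, since the maximum then simply selects the inflated RAVE value.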

Tom, D., & Müller, M. (2011). Computational experiments with the RAVE heuristic. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6515 LNCS, pp. 69–80). https://doi.org/10.1007/978-3-642-17928-0_7
