Memorizing the Playout Policy

Tristan Cazenave; Eustache Diemert

Conference Proceedings

Communications in Computer and Information Science, vol. 818 (2018), pp. 96–107

DOI: 10.1007/978-3-319-75931-9_7

Abstract

Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). Playout Policy Adaptation with move Features (PPAF) is a state-of-the-art MCTS algorithm that learns a playout policy online. We propose a simple modification to PPAF that consists of memorizing the learned policy from one move to the next. We test PPAF with memorization (PPAFM) against PPAF and UCT on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Knightthrough, Misere Knightthrough, and Nogo.

Citation (APA)

Cazenave, T., & Diemert, E. (2018). Memorizing the Playout Policy. In Communications in Computer and Information Science (Vol. 818, pp. 96–107). Springer Verlag. https://doi.org/10.1007/978-3-319-75931-9_7
