Improved diversity in nested rollout policy adaptation

Stefan Edelkamp; Tristan Cazenave

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016), vol. 9904 LNAI, pp. 43-55

DOI: 10.1007/978-3-319-46073-4_4

Abstract

For combinatorial search in single-player games, nested Monte-Carlo search is an apparent alternative to algorithms like UCT that are applied in two-player and general games. To trade off exploration against exploitation, the randomized search procedure intensifies the search with increasing recursion depth. If a concise mapping from states to actions is available, the integration of policy learning yields nested rollout with policy adaptation (NRPA), while Beam-NRPA keeps a bounded number of solutions in each recursion level. In this paper, we propose refinements for Beam-NRPA that improve the runtime and the solution diversity.
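
To make the recursion described in the abstract concrete, the following is a minimal sketch of the commonly described NRPA scheme on a toy problem. It is not the Beam-NRPA refinement proposed in the paper, and none of it is taken from the paper itself: the toy domain (`TARGET`, `MOVES`, `score`), the function names, the step size `alpha`, and the iteration count are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical toy domain (an assumption for illustration): pick one move per
# position; the score counts positions that match a hidden target.
TARGET = [2, 0, 1, 2, 1]
MOVES = [0, 1, 2]


def score(seq):
    """Number of positions where the chosen move equals the target."""
    return sum(1 for m, t in zip(seq, TARGET) if m == t)


def playout(policy, rng):
    """Level 0: sample a full sequence with softmax weights from the policy."""
    seq = []
    for pos in range(len(TARGET)):
        weights = [math.exp(policy.get((pos, m), 0.0)) for m in MOVES]
        seq.append(rng.choices(MOVES, weights=weights)[0])
    return score(seq), seq


def adapt(policy, seq, alpha=1.0):
    """Shift policy weight toward the moves of the best sequence found so far."""
    new = dict(policy)
    for pos, best in enumerate(seq):
        # Normaliser over all legal moves, computed from the old policy.
        z = sum(math.exp(policy.get((pos, m), 0.0)) for m in MOVES)
        for m in MOVES:
            old = policy.get((pos, m), 0.0)
            new[(pos, m)] = new.get((pos, m), 0.0) - alpha * math.exp(old) / z
        new[(pos, best)] = new.get((pos, best), 0.0) + alpha
    return new


def nrpa(level, policy, rng, iterations=20):
    """Each level runs several searches one level down and adapts a local policy."""
    if level == 0:
        return playout(policy, rng)
    best_score, best_seq = -1, []
    local = dict(policy)  # each recursion level works on its own copy
    for _ in range(iterations):
        s, seq = nrpa(level - 1, local, rng, iterations)
        if s >= best_score:
            best_score, best_seq = s, seq
        local = adapt(local, best_seq)
    return best_score, best_seq
```

A call such as `nrpa(2, {}, random.Random(0))` returns a `(score, sequence)` pair. Higher levels spend more iterations refining the same policy, which is one way to read the abstract's remark that the search intensifies with increasing recursion depth.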

Cite

Edelkamp, S., & Cazenave, T. (2016). Improved diversity in nested rollout policy adaptation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9904 LNAI, pp. 43–55). Springer Verlag. https://doi.org/10.1007/978-3-319-46073-4_4
