Improved diversity in nested rollout policy adaptation

Abstract

For combinatorial search in single-player games, nested Monte-Carlo search is an obvious alternative to algorithms like UCT, which are applied in two-player and general games. To trade off exploration against exploitation, the randomized search procedure intensifies the search with increasing recursion depth. If a concise mapping from states to actions is available, integrating policy learning yields nested rollout with policy adaptation (NRPA), while Beam-NRPA keeps a bounded number of solutions at each recursion level. In this paper we propose refinements for Beam-NRPA that improve the runtime and the solution diversity.
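To make the NRPA idea concrete, here is a minimal sketch of plain NRPA as it is commonly described (level-0 playouts sampled from a softmax policy, with each level adapting the policy toward its best sequence so far). It is not the paper's refined Beam-NRPA and does not keep a beam. The toy game, the constants `LENGTH`, `N_MOVES`, `TARGET` and `ALPHA`, and all function names are assumptions made for illustration only.

```python
import math
import random

# Hypothetical toy game (not from the paper): choose one of N_MOVES moves
# at each of LENGTH steps; the score counts positions that match TARGET.
LENGTH = 8
N_MOVES = 3
TARGET = [2, 0, 1, 1, 2, 0, 2, 1]
ALPHA = 1.0  # learning rate for policy adaptation (assumed value)


def legal_moves(state):
    return list(range(N_MOVES)) if len(state) < LENGTH else []


def code(state, move):
    # The "concise mapping" from (state, move) to a policy entry.
    return (len(state), move)


def score(state):
    return sum(1 for a, b in zip(state, TARGET) if a == b)


def playout(policy, rng):
    # Level 0: sample moves with probability proportional to exp(weight).
    state = []
    while legal_moves(state):
        moves = legal_moves(state)
        weights = [math.exp(policy.get(code(state, m), 0.0)) for m in moves]
        state.append(rng.choices(moves, weights=weights)[0])
    return score(state), state


def adapt(policy, sequence):
    # Shift probability mass toward the moves of `sequence`.
    new_policy = dict(policy)
    state = []
    for move in sequence:
        moves = legal_moves(state)
        weights = [math.exp(policy.get(code(state, m), 0.0)) for m in moves]
        z = sum(weights)
        for m, w in zip(moves, weights):
            key = code(state, m)
            new_policy[key] = new_policy.get(key, 0.0) - ALPHA * w / z
        key = code(state, move)
        new_policy[key] = new_policy.get(key, 0.0) + ALPHA
        state = state + [move]
    return new_policy


def nrpa(level, policy, iterations, rng):
    if level == 0:
        return playout(policy, rng)
    best_score, best_seq = -1, []
    for _ in range(iterations):
        s, seq = nrpa(level - 1, policy, iterations, rng)
        if s >= best_score:
            best_score, best_seq = s, seq
        # adapt() returns a new dict, so the caller's policy is untouched.
        policy = adapt(policy, best_seq)
    return best_score, best_seq
```

A call such as `nrpa(2, {}, 20, random.Random(0))` returns a `(score, sequence)` pair. Deeper levels spend more playouts on adapting the same policy, which is how the search intensifies with recursion depth; a beam variant would keep several `(score, sequence, policy)` candidates per level instead of a single best one.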

Cite

Edelkamp, S., & Cazenave, T. (2016). Improved diversity in nested rollout policy adaptation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9904 LNAI, pp. 43–55). Springer Verlag. https://doi.org/10.1007/978-3-319-46073-4_4
