Improving exploration in UCT using local manifolds

Abstract

Monte Carlo planning has proven successful in many sequential decision-making settings, but it explores poorly when rewards are sparse. In this paper, we improve exploration in UCT by generalizing across similar states using a given distance metric. When the state space has no natural distance metric, we show how to obtain one by learning a local manifold from the transition graph of states in the near future. On domains inspired by video games, empirical evidence shows that our algorithm is more sample-efficient than UCT, particularly when rewards are sparse.
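To make the idea of generalizing across similar states concrete, here is a minimal sketch of a UCB-style action selection in which visit counts and returns are pooled from nearby states, weighted by a kernel over the distance metric. This is an illustration only, not the paper's exact formulation: the Gaussian kernel, the `bandwidth` parameter, the `log(1 + n)` bonus term, and the function names `generalized_stats` and `select_action` are all assumptions introduced here.

```python
import math


def gaussian_kernel(d, bandwidth=1.0):
    """Similarity weight in (0, 1] that decays with distance d (assumed kernel)."""
    return math.exp(-(d * d) / (2.0 * bandwidth * bandwidth))


def generalized_stats(state, action, stats, distance, bandwidth=1.0):
    """Pool statistics for `action` from all recorded states, weighted by similarity.

    stats maps (state, action) -> (visit_count, total_return).
    Returns (effective_visit_count, effective_total_return).
    """
    n_eff = 0.0
    ret_eff = 0.0
    for (s, a), (n, total) in stats.items():
        if a != action:
            continue
        w = gaussian_kernel(distance(state, s), bandwidth)
        n_eff += w * n
        ret_eff += w * total
    return n_eff, ret_eff


def select_action(state, actions, stats, distance, c=1.4, bandwidth=1.0):
    """Pick the action with the highest UCB score computed from pooled statistics."""
    pooled = {
        a: generalized_stats(state, a, stats, distance, bandwidth) for a in actions
    }
    total_n = sum(n for n, _ in pooled.values())
    best_action, best_score = None, -math.inf
    for a in actions:
        n, ret = pooled[a]
        if n < 1e-9:
            # No similar state has tried this action yet: explore it first.
            return a
        # log(1 + total_n) keeps the bonus defined for fractional pooled counts.
        score = ret / n + c * math.sqrt(math.log(1.0 + total_n) / n)
        if score > best_score:
            best_action, best_score = a, score
    return best_action


# Usage: states are integers on a line, distance is absolute difference.
stats = {(0, "left"): (10, 0.0), (1, "right"): (10, 0.0)}
choice = select_action(0, ["left", "right"], stats, lambda x, y: abs(x - y))
print(choice)
```

In this sketch, an action tried only in a neighboring state still contributes a discounted count at the current state, so the exploration bonus reflects experience gathered nearby rather than treating every state as unseen.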

Cite

Srinivasan, S., Talvitie, E., & Bowling, M. (2015). Improving exploration in UCT using local manifolds. In Proceedings of the National Conference on Artificial Intelligence (Vol. 5, pp. 3386–3392). AI Access Foundation. https://doi.org/10.1609/aaai.v29i1.9660
