PAC optimal exploration in continuous space Markov decision processes

Abstract

Current exploration algorithms can be classified into two broad categories: heuristic and PAC optimal. While numerous researchers have used heuristic approaches such as ε-greedy exploration successfully, such approaches lack formal, finite-sample guarantees and may need a significant amount of fine-tuning to produce good results. PAC optimal exploration algorithms, on the other hand, offer strong theoretical guarantees but are inapplicable in domains of realistic size. The goal of this paper is to bridge the gap between theory and practice by introducing C-PACE, an algorithm that offers strong theoretical guarantees and can be applied to interesting, continuous space problems.

Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Cite

Pazis, J., & Parr, R. (2013). PAC optimal exploration in continuous space Markov decision processes. In Proceedings of the 27th AAAI Conference on Artificial Intelligence, AAAI 2013 (pp. 774–781). https://doi.org/10.1609/aaai.v27i1.8678
