Advice-based exploration in model-based reinforcement learning

Rodrigo Toro Icarte; Toryn Q. Klassen; Richard Anthony Valenzano; Sheila A. McIlraith

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10832 LNAI 72-83

DOI: 10.1007/978-3-319-89656-4_6

Abstract

Convergence to an optimal policy using model-based reinforcement learning can require significant exploration of the environment. In some settings, such exploration is costly or even impossible, such as in cases where simulators are not available, or where there are prohibitively large state spaces. In this paper we examine the use of advice to guide the search for an optimal policy. To this end we propose a rich language for providing advice to a reinforcement learning agent. Unlike constraints, which can eliminate optimal policies, advice offers guidance for the exploration while preserving the guarantee of convergence to an optimal policy. Experimental results on deterministic grid worlds demonstrate the potential for good advice to reduce the amount of exploration required to learn a satisficing or optimal policy, while maintaining robustness in the face of incomplete or misleading advice.

Cite

Toro Icarte, R., Klassen, T. Q., Valenzano, R. A., & McIlraith, S. A. (2018). Advice-based exploration in model-based reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10832 LNAI, pp. 72–83). Springer Verlag. https://doi.org/10.1007/978-3-319-89656-4_6
