Multigrid reinforcement learning with reward shaping
Artificial Neural Networks - ICANN 2008

  • Grześ M
  • Kudenko D
ISSN: 0302-9743

Abstract

Potential-based reward shaping has been shown to be a powerful method to improve the convergence rate of reinforcement learning agents. It is a flexible technique for incorporating background knowledge into temporal-difference learning in a principled way. However, the question remains of how to compute the potential that is used to shape the reward given to the learning agent. In this paper we propose a way to solve this problem in reinforcement learning with state space discretisation. In particular, we show that the potential function can be learned online, in parallel with the actual reinforcement learning process. If the Q-function is learned for states determined by a given grid, a V-function for states at a lower resolution can be learned in parallel and used to approximate the potential for ground-level learning. The novel algorithm is presented and experimentally evaluated.
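The abstract describes the scheme only at a high level. The Python sketch below illustrates the general idea under assumptions of our own (a 1-D state space on [0, 1], a toy reward, and arbitrary grid sizes and learning parameters), not the paper's exact algorithm: a Q-function is learned on a fine discretisation while a V-function is learned in parallel on a coarser one, and the coarse V-function serves as the potential Phi in the standard shaping term F(s, s') = gamma * Phi(s') - Phi(s) of Ng et al. (1999).

import random

GAMMA, ALPHA, EPSILON = 0.99, 0.1, 0.1
FINE_BINS, COARSE_BINS = 100, 10      # the coarse grid has lower resolution
ACTIONS = (-1, +1)                    # move left or right on the unit interval

Q = {}                                # fine-grid action values, default 0
V = [0.0] * COARSE_BINS               # coarse-grid state values, used as Phi

def fine(s):   return min(int(s * FINE_BINS), FINE_BINS - 1)
def coarse(s): return min(int(s * COARSE_BINS), COARSE_BINS - 1)
def q(s, a):   return Q.get((fine(s), a), 0.0)

def step(s, a):
    """Toy episodic task on [0, 1]: reward 1 only for reaching the right end."""
    s2 = min(max(s + 0.01 * a + random.gauss(0.0, 0.005), 0.0), 1.0)
    done = s2 >= 0.99
    return s2, (1.0 if done else 0.0), done

def choose(s):
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q(s, a))

for episode in range(500):
    s, done = 0.05, False
    while not done:
        a = choose(s)
        s2, r, done = step(s, a)

        # Potential-based shaping F(s, s') = gamma * Phi(s') - Phi(s),
        # with Phi read from the coarse V-function and Phi(terminal) = 0.
        phi, phi2 = V[coarse(s)], (0.0 if done else V[coarse(s2)])
        shaped = r + GAMMA * phi2 - phi

        # Q-learning update on the fine grid with the shaped reward.
        boot = 0.0 if done else max(q(s2, b) for b in ACTIONS)
        Q[(fine(s), a)] = q(s, a) + ALPHA * (shaped + GAMMA * boot - q(s, a))

        # In parallel: TD(0) update of the coarse V-function on the raw reward.
        v_boot = 0.0 if done else V[coarse(s2)]
        V[coarse(s)] += ALPHA * (r + GAMMA * v_boot - V[coarse(s)])

        s = s2

Setting Phi to zero at terminal states follows the usual convention for episodic tasks, which keeps the shaping term from biasing the optimal policy; all other constants here are illustrative.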

Citation (APA)

Grześ, M., & Kudenko, D. (2008). Multigrid reinforcement learning with reward shaping in Artificial Neural Networks, ICANN 2008. Artificial Neural Networks - ICANN 2008, 5163(September), 357–366. Retrieved from http://www.springerlink.com/index/10.1007/978-3-540-87536-9
