Regularized Fitted Q-Iteration for Planning in Continuous-Space Markovian Decision Problems

Abstract

Reinforcement learning with linear and non-linear function approximation has been studied extensively in the last decade. However, as opposed to other fields of machine learning such as supervised learning, the effect of finite samples has not been thoroughly addressed within the reinforcement learning framework. In this paper, we propose to use L2 regularization to control the complexity of the value function in reinforcement learning and planning problems. We consider the Regularized Fitted Q-Iteration algorithm and provide generalization bounds that account for small sample sizes. Finally, a realistic visual-servoing problem is used to illustrate the benefits of the regularization procedure.
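
To make the idea concrete, the sketch below shows Fitted Q-Iteration with an L2 (ridge) penalty on the weights of a linear Q-function, which is the simplest instance of the regularization scheme the abstract describes. It assumes batch transitions, a finite action set, and a user-supplied feature map; the names and defaults (rfqi, phi, gamma, lam, n_iters) are illustrative, not taken from the paper, which regularizes in a general function space rather than this linear special case.

```python
# Minimal sketch of L2-Regularized Fitted Q-Iteration (linear special case).
# Assumption: transitions is a batch of (state, action, reward, next_state)
# tuples and phi(s, a) returns a fixed-length feature vector.
import numpy as np

def ridge_fit(Phi, y, lam):
    """L2-regularized least squares: w = (Phi^T Phi + lam*I)^{-1} Phi^T y."""
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ y)

def rfqi(transitions, phi, n_actions, gamma=0.99, lam=1e-2, n_iters=50):
    """Regularized Fitted Q-Iteration on batch data (illustrative sketch).

    lam is the L2 regularization coefficient that controls the complexity
    of the fitted Q-function, as in the abstract.
    """
    S, A, R, S_next = zip(*transitions)
    Phi = np.array([phi(s, a) for s, a in zip(S, A)])  # design matrix
    w = np.zeros(Phi.shape[1])                         # initial Q == 0
    for _ in range(n_iters):
        # Bellman targets: r + gamma * max_a' Q(s', a') under current weights.
        q_next = np.array([[phi(s2, a2) @ w for a2 in range(n_actions)]
                           for s2 in S_next])
        y = np.array(R) + gamma * q_next.max(axis=1)
        # Each iteration reduces to a ridge-regression problem on the targets.
        w = ridge_fit(Phi, y, lam)
    return w
```

The greedy policy induced by the result picks argmax over a of phi(s, a) @ w; raising lam shrinks the weights and trades fitting accuracy for lower complexity, which is the mechanism behind the paper's finite-sample bounds.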

Authors

  • Amir Massoud Farahmand

  • Mohammad Ghavamzadeh

  • Csaba Szepesvári

  • Shie Mannor
