Approximate dynamic programming and reinforcement learning

Abstract

Dynamic programming (DP) and reinforcement learning (RL) can be used to address problems from a variety of fields, including automatic control, artificial intelligence, operations research, and economics. Many problems in these fields are described by continuous variables, whereas DP and RL can find exact solutions only in the discrete case. Therefore, approximation is essential in practical DP and RL. This chapter provides an in-depth review of the literature on approximate DP and RL in large or continuous-space, infinite-horizon problems. Value iteration, policy iteration, and policy search approaches are presented in turn. Model-based (DP) as well as online and batch model-free (RL) algorithms are discussed. We review theoretical guarantees on the approximate solutions produced by these algorithms. Numerical examples illustrate the behavior of several representative algorithms in practice. Techniques to automatically derive value function approximators are discussed, and a comparison between value iteration, policy iteration, and policy search is provided. The chapter closes with a discussion of open issues and promising research directions in approximate DP and RL. © 2010 Springer-Verlag Berlin Heidelberg.
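To make the core idea concrete, here is a minimal sketch of approximate value iteration on a hypothetical one-dimensional continuous-state problem. The problem, grid resolution, and discount factor are illustrative assumptions, not taken from the chapter; the value function is represented by piecewise-linear interpolation over a fixed grid, which stands in for the more general approximators the chapter surveys.

```python
import numpy as np

# Toy continuous-state problem (hypothetical, for illustration):
# state x in [-1, 1], actions a in {-0.1, 0, +0.1},
# dynamics x' = clip(x + a), reward r = -x'^2 (drive the state to 0).
ACTIONS = np.array([-0.1, 0.0, 0.1])
GAMMA = 0.95

def step(x, a):
    x_next = np.clip(x + a, -1.0, 1.0)
    return x_next, -x_next ** 2

# Value function approximator: piecewise-linear interpolation over
# a grid of representative states; the parameters are the values
# stored at the grid points.
grid = np.linspace(-1.0, 1.0, 41)
theta = np.zeros_like(grid)

def v_hat(x, theta):
    return np.interp(x, grid, theta)  # approximate V(x)

# Approximate value iteration: apply the Bellman optimality backup
# at the grid points only, then refit the approximator (here the
# projection step is trivial: store the backed-up values directly).
for _ in range(200):
    backups = np.empty_like(theta)
    for i, x in enumerate(grid):
        q_values = [r + GAMMA * v_hat(x_next, theta)
                    for x_next, r in (step(x, a) for a in ACTIONS)]
        backups[i] = max(q_values)
    theta = backups

# Greedy policy read off from the approximate value function.
def policy(x, theta):
    q_values = [r + GAMMA * v_hat(x_next, theta)
                for x_next, r in (step(x, a) for a in ACTIONS)]
    return ACTIONS[int(np.argmax(q_values))]

x = 0.8
for _ in range(5):
    a = policy(x, theta)
    x, _ = step(x, a)
    print(f"a = {a:+.1f}, x = {x:+.3f}")
```

Running the sketch, the greedy policy steers the state toward the origin, as the quadratic cost dictates. The same structure (backup at sample states, then refit) underlies the fitted value iteration and approximate Q-iteration algorithms reviewed in the chapter, with the interpolator replaced by richer approximators.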

Citation (APA)

Buşoniu, L., De Schutter, B., & Babuška, R. (2010). Approximate dynamic programming and reinforcement learning. Studies in Computational Intelligence, 281, 3–44. https://doi.org/10.1007/978-3-642-11688-9_1
