Introduction

Abstract

In reinforcement learning (RL) [Sutton and Barto (1998)] problems, learning agents execute sequential actions with the goal of maximizing a reward signal, which may be time-delayed. For example, an agent could learn to play a game by being told whether it wins or loses, without ever being told the "correct" action. The RL framework has gained popularity with the development of algorithms capable of mastering increasingly complex problems. However, when RL agents begin learning tabula rasa, mastering difficult tasks is often slow or infeasible, and thus a significant amount of current research in RL focuses on improving the speed of learning by exploiting domain expertise with varying amounts of human-provided knowledge. Common approaches include deconstructing the task into a hierarchy of subtasks (cf. MAXQ [Dietterich (2000)]), learning over temporally abstract actions (e.g., using the options framework [Sutton et al. (1999)]) rather than simple one-step actions, and abstracting over the state space (e.g., via function approximation [Sutton and Barto (1998)]) so agents may efficiently generalize experience. © 2009 Springer-Verlag Berlin Heidelberg.
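To make the tabula rasa setting described above concrete, the following is a minimal sketch of a tabular Q-learning agent on a hypothetical five-state chain with a delayed reward of +1 at the goal. The environment, reward structure, and hyperparameters here are illustrative assumptions, not taken from the chapter.

```python
import random

# Minimal sketch (assumed setup): tabular Q-learning on a hypothetical
# 5-state chain. The agent starts tabula rasa (all Q-values zero) and only
# observes a delayed reward of +1 upon reaching the rightmost state.

N_STATES = 5                     # states 0..4; state 4 is the goal (assumed)
ACTIONS = [0, 1]                 # 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # illustrative hyperparameters

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Hypothetical chain dynamics: reward is given only at the goal state."""
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection over the learned Q-values
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward, done = step(state, action)
        # one-step Q-learning update toward the bootstrapped target
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = next_state

# Greedy policy after learning: the agent should prefer moving right everywhere.
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)})
```

This bare-bones agent learns only from the delayed win/lose-style signal; the hierarchical, temporally abstract, and function-approximation approaches mentioned in the abstract are ways of speeding up exactly this kind of learning.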

Cite

APA

Taylor, M. E. (2009). Introduction. Studies in Computational Intelligence. https://doi.org/10.1007/978-3-642-01882-4_1
