Q(λ)-learning uses TD(λ) methods to accelerate Q-learning. For previous online, lookup-table-based implementations of Q(λ), the worst-case complexity of a single update step is bounded by the size of the state-action space. Our faster algorithm's worst-case complexity is bounded by the number of actions. The algorithm is based on the observation that Q-value updates may be postponed until they are needed.
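A minimal sketch of the postponement idea follows (it is not the paper's exact pseudocode, and all names here are illustrative). Since a replacing trace set to 1 at step t0 has decayed to (γλ)^(t−t0) by step t, the sum of all postponed trace updates for a pair is a difference of two prefix sums of decayed TD errors, applied lazily only when the pair is next read or visited. Watkins-style trace cutting on exploratory actions is omitted for brevity.

```python
from collections import defaultdict

class LazyQLambda:
    """Sketch of lazy (postponed) Q(lambda) updates with replacing traces."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, lam=0.8):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.lam = alpha, gamma, lam
        self.q = defaultdict(float)    # Q-values keyed by (state, action)
        self.trace_time = {}           # step at which a pair's trace was set to 1
        self.synced = {}               # prefix index up to which errors were applied
        self.delta_prefix = [0.0]      # delta_prefix[k] = sum_{t<k} err_t * (gamma*lam)**t
        self.t = 0                     # global step counter

    def _sync(self, sa):
        """Apply all TD errors postponed since this pair was last touched."""
        if sa in self.trace_time:
            decay = self.gamma * self.lam
            pending = self.delta_prefix[self.t] - self.delta_prefix[self.synced[sa]]
            # Stored errors carry a factor decay**t; the pair's trace carries only
            # decay**(t - t0), hence the rescaling below.  (In long runs decay**t
            # underflows; a practical version needs periodic renormalization,
            # omitted here for brevity.)
            self.q[sa] += self.alpha * pending / decay ** self.trace_time[sa]
            self.synced[sa] = self.t

    def value(self, state, action):
        self._sync((state, action))
        return self.q[(state, action)]

    def step(self, state, action, reward, next_state):
        """One transition: only the current pair and the next state's actions
        are synced, so the cost is O(n_actions), not O(|S||A|)."""
        sa = (state, action)
        self._sync(sa)
        next_v = max(self.value(next_state, a) for a in range(self.n_actions))
        err = reward + self.gamma * next_v - self.q[sa]
        self.q[sa] += self.alpha * err  # visited pair: trace is 1, update now
        self.delta_prefix.append(self.delta_prefix[-1]
                                 + err * (self.gamma * self.lam) ** self.t)
        self.trace_time[sa] = self.t    # replacing trace, reset to 1
        self.synced[sa] = self.t + 1    # err already applied directly above
        self.t += 1
```

All other traced pairs are left untouched each step; the single appended prefix entry stands in for the per-pair trace updates that a naive implementation would perform over the whole state-action space.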
CITATION STYLE
Wiering, M., & Schmidhuber, J. (1998). Speeding up Q(λ)-learning. In Lecture Notes in Computer Science (Vol. 1398, pp. 352–363). Springer-Verlag. https://doi.org/10.1007/bfb0026706