Asynchronous stochastic approximation and Q-learning

Abstract

We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establish its convergence under conditions more general than previously available.
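
As a rough illustration of the algorithm the abstract names, the sketch below runs the standard tabular Q-learning update, Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)), on a small randomly generated MDP. The toy MDP, the 1/n(s,a) step-size schedule, and the uniform exploration rule are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, gamma = 5, 2, 0.9

# Hypothetical toy MDP: random transition probabilities P[s, a, s']
# and rewards R[s, a], for illustration only.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(size=(n_states, n_actions))

Q = np.zeros((n_states, n_actions))
visits = np.zeros((n_states, n_actions))  # per-pair counts for decaying step sizes

s = 0
for _ in range(50_000):
    a = rng.integers(n_actions)               # uniform exploration (an assumption)
    s_next = rng.choice(n_states, p=P[s, a])  # sample the next state
    visits[s, a] += 1
    alpha = 1.0 / visits[s, a]                # step size satisfying the usual
                                              # Robbins-Monro conditions
    # Asynchronous update: only the visited (s, a) component changes,
    # with its own step size.
    Q[s, a] += alpha * (R[s, a] + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print(Q)
```

The per-pair step sizes reflect the asynchronous setting the paper analyzes: each component of Q is updated only when its state-action pair is visited, on its own schedule.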

Cite

APA

Tsitsiklis, J. N. (1994). Asynchronous stochastic approximation and Q-learning. Machine Learning, 16(3), 185–202. https://doi.org/10.1007/BF00993306
