A new discrete-time iterative adaptive dynamic programming algorithm based on Q-learning

1Citations
Citations of this article
3Readers
Mendeley users who have this article in their library.
Get full text

Abstract

In this paper, a novel Q-learning based policy iteration adaptive dynamic programming (ADP) algorithm is developed to solve the optimal control problems for discrete-time nonlinear systems. The idea is to use a policy iteration ADP technique to construct the iterative control law which stabilizes the system and simultaneously minimizes the iterative Q function. Convergence property is analyzed to show that the iterative Q function is monotonically non-increasing and converges to the solution of the optimality equation. Finally, simulation results are presented to show the performance of the developed algorithm.

Cite

CITATION STYLE

APA

Wei, Q., & Liu, D. (2015). A new discrete-time iterative adaptive dynamic programming algorithm based on Q-learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9377 LNCS, pp. 43–52). Springer Verlag. https://doi.org/10.1007/978-3-319-25393-0_6

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free