In this paper, a novel Q-learning based policy iteration adaptive dynamic programming (ADP) algorithm is developed to solve optimal control problems for discrete-time nonlinear systems. The idea is to use a policy iteration ADP technique to construct an iterative control law that stabilizes the system and simultaneously minimizes the iterative Q function. The convergence property is analyzed to show that the iterative Q function is monotonically non-increasing and converges to the solution of the optimality equation. Finally, simulation results are presented to demonstrate the performance of the developed algorithm.
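The abstract does not state the iteration equations, but the kind of scheme it describes can be sketched as follows: given an admissible control law v_i, evaluate its Q function from Q_i(x,u) = U(x,u) + Q_i(x', v_i(x')), then improve the policy by v_{i+1}(x) = argmin_u Q_i(x,u), which yields a monotonically non-increasing Q-function sequence. The dynamics, utility function, grids, and initial policy below are illustrative assumptions on a discretized toy system, not the paper's benchmark:

```python
import numpy as np

# Hypothetical discrete-time nonlinear system and quadratic utility
# (illustrative only; the paper's example system is not given in the abstract).
f = lambda x, u: 0.8 * np.sin(x) + u        # x_{k+1} = f(x_k, u_k)
U = lambda x, u: x**2 + u**2                # stage cost U(x_k, u_k)

xs = np.linspace(-1.0, 1.0, 41)             # discretized state grid
us = np.linspace(-1.0, 1.0, 41)             # discretized control grid

def nearest(grid, v):
    """Index of the nearest grid point for each entry of v."""
    return np.abs(grid - np.clip(v, grid[0], grid[-1])[..., None]).argmin(-1)

# Initial admissible (stabilizing) control law: cancel the drift term.
policy = nearest(us, -0.8 * np.sin(xs))     # control index for each state

X, Uc = np.meshgrid(xs, us, indexing="ij")
stage = U(X, Uc)                            # U(x, u) on the grid
ix_next = nearest(xs, f(X, Uc))             # grid index of x' = f(x, u)

Q_hist = []
for _ in range(6):                          # outer policy-iteration loop
    # Policy evaluation: fixed-point iteration on
    # Q(x, u) = U(x, u) + Q(x', v_i(x')), starting from Q = 0.
    Q = np.zeros_like(stage)
    for _ in range(200):
        Q = stage + Q[ix_next, policy[ix_next]]
    Q_hist.append(Q.copy())
    # Policy improvement: v_{i+1}(x) = argmin_u Q_i(x, u).
    policy = Q.argmin(axis=1)
```

With an admissible initial policy, each `Q_hist[i+1]` is pointwise no larger than `Q_hist[i]`, mirroring the non-increasing convergence the paper proves; the discretization here just makes that behavior easy to inspect numerically.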
Citation
Wei, Q., & Liu, D. (2015). A new discrete-time iterative adaptive dynamic programming algorithm based on Q-learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9377 LNCS, pp. 43–52). Springer Verlag. https://doi.org/10.1007/978-3-319-25393-0_6