Generalized policy iteration ADP for discrete-time nonlinear systems

Abstract

In this chapter, generalized policy iteration (GPI) algorithms are developed to solve infinite-horizon optimal control problems for discrete-time nonlinear systems. GPI algorithms combine the ideas of the policy iteration and value iteration algorithms of adaptive dynamic programming (ADP). They permit the algorithm to be initialized with an arbitrary positive semidefinite function, and employ two interleaved iteration procedures, one for policy evaluation and one for policy improvement. The monotonicity, convergence, admissibility, and optimality properties of the developed GPI algorithms for discrete-time nonlinear systems are then analyzed. To implement the GPI algorithms, neural networks are employed to approximate the iterative value functions and to compute the iterative control laws, respectively, yielding an approximate optimal control law. Simulation examples are included to verify the effectiveness of the developed algorithms.
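The abstract's description of GPI can be made concrete with a minimal sketch. Assuming a scalar discrete-time system x_{k+1} = F(x_k, u_k) with utility U(x, u) (both illustrative choices, not taken from the chapter), a grid-tabulated value function stands in for the chapter's neural-network approximators. All names here (F, U, interp_V, the grids xs and us, and the inner evaluation count of 3) are hypothetical.

```python
import numpy as np

# Illustrative 1-D discrete-time nonlinear system x_{k+1} = F(x_k, u_k)
# and utility U(x, u); both are assumed for this sketch.
def F(x, u):
    return 0.9 * x + 0.1 * x**3 + u   # assumed dynamics

def U(x, u):
    return x**2 + u**2                # assumed quadratic utility

xs = np.linspace(-1.0, 1.0, 101)      # state grid
us = np.linspace(-1.0, 1.0, 41)       # candidate-control grid

def interp_V(V, x):
    # Piecewise-linear interpolation of the tabulated value function,
    # clipping states to the grid; a stand-in for the critic network.
    return np.interp(np.clip(x, xs[0], xs[-1]), xs, V)

V = np.zeros_like(xs)                 # arbitrary positive semidefinite init

for i in range(50):                   # outer GPI iterations
    # Policy improvement: u_i(x) = argmin_u [ U(x,u) + V_i(F(x,u)) ].
    Q = U(xs[:, None], us[None, :]) + interp_V(V, F(xs[:, None], us[None, :]))
    pi = us[np.argmin(Q, axis=1)]
    # Policy evaluation: a few sweeps of the fixed-policy Bellman recursion
    # V(x) <- U(x, u_i(x)) + V(F(x, u_i(x))).
    for _ in range(3):                # N_i = 3 evaluation steps (tunable)
        V = U(xs, pi) + interp_V(V, F(xs, pi))

print("V(0) ≈", interp_V(V, 0.0))
```

Raising the inner evaluation count pushes the scheme toward policy iteration, while a single evaluation step recovers value iteration; GPI interpolates between these two extremes.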

Citation (APA)

Liu, D., Wei, Q., Wang, D., Yang, X., & Li, H. (2017). Generalized policy iteration ADP for discrete-time nonlinear systems. In Advances in Industrial Control (pp. 177–221). Springer International Publishing. https://doi.org/10.1007/978-3-319-50815-3_5
