In this chapter, generalized policy iteration (GPI) algorithms are developed to solve infinite-horizon optimal control problems for discrete-time nonlinear systems. GPI algorithms combine the ideas of the policy iteration and value iteration algorithms of adaptive dynamic programming (ADP): they can be initialized with an arbitrary positive semidefinite function, and two interleaved iterations are used for policy evaluation and policy improvement, respectively. The monotonicity, convergence, admissibility, and optimality properties of the developed GPI algorithms for discrete-time nonlinear systems are then analyzed. To implement the GPI algorithms, neural networks are employed to approximate the iterative value functions and compute the iterative control laws, yielding an approximate optimal control law. Simulation examples are included to verify the effectiveness of the developed algorithms.
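The interleaving of policy evaluation and policy improvement described above can be sketched on a toy problem. The fragment below is a minimal, hedged illustration of the GPI loop on a small finite-state shortest-path problem with a discounted stage cost; the chapter's actual setting (continuous-state nonlinear dynamics with neural-network value approximation) is more general, and all names here (`n_states`, `step`, `cost`, the sweep counts) are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

# Illustrative GPI sketch on a deterministic 6-state chain (not the
# chapter's neural-network implementation). States 0..5; the goal is
# state 5. Action 0 = stay, action 1 = move one step toward the goal.
n_states, n_actions = 6, 2
goal = n_states - 1

def step(s, a):
    """Deterministic transition: action 1 moves one state toward the goal."""
    return min(s + a, goal)

def cost(s, a):
    """Stage cost: zero at the goal, one elsewhere."""
    return 0.0 if s == goal else 1.0

gamma = 0.9
V = np.zeros(n_states)                 # arbitrary PSD initialization (here, zero)
policy = np.zeros(n_states, dtype=int) # initial control law: stay everywhere

for _ in range(50):  # outer GPI loop
    # Policy evaluation: a few fixed-point sweeps under the current policy.
    for _ in range(3):
        V = np.array([cost(s, policy[s]) + gamma * V[step(s, policy[s])]
                      for s in range(n_states)])
    # Policy improvement: greedy (cost-minimizing) one-step lookahead.
    policy = np.array([min(range(n_actions),
                           key=lambda a, s=s: cost(s, a) + gamma * V[step(s, a)])
                       for s in range(n_states)], dtype=int)

print(policy)  # moves right (action 1) at every state before the goal
```

Varying the number of inner evaluation sweeps recovers the two extremes GPI interpolates between: one sweep per outer loop behaves like value iteration, while evaluating to convergence recovers policy iteration.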
Liu, D., Wei, Q., Wang, D., Yang, X., & Li, H. (2017). Generalized policy iteration ADP for discrete-time nonlinear systems. In Advances in Industrial Control (pp. 177–221). Springer International Publishing. https://doi.org/10.1007/978-3-319-50815-3_5