Model-free optimal tracking control for discrete-time system with delays using reinforcement Q-learning

Abstract

A reinforcement Q-learning algorithm is proposed for the optimal tracking control problem with unknown dynamics and delays. Traditional reinforcement learning methods require an accurate system model; the Q-learning method avoids this requirement, which matters in practice because all or part of a system's model is often difficult or costly to obtain. First, an augmented system composed of the original system and the reference trajectory is constructed, and the corresponding augmented linear quadratic tracking (LQT) Bellman equation is derived. On this basis, the reinforcement Q-learning algorithm is presented. To implement the method, the iteration equations are solved online using the least-squares technique.
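The pipeline the abstract describes (augment the state with the reference trajectory, write the LQT Bellman equation for a quadratic Q-function, and solve it online by least squares) can be sketched for a scalar plant. This is only an illustrative assumption-laden sketch, not the paper's algorithm: the plant (a, b), reference generator f, weights q, R, discount gamma, and the policy-iteration loop are all made up here, and the delay handling is omitted.

```python
import numpy as np

# Illustrative scalar LQT sketch; all numerical values are assumptions, not
# taken from the paper. The plant is "unknown" to the learner: it is used
# only to generate data, never inside the update equations.
a, b = 0.8, 1.0          # plant:      x_{k+1} = a x_k + b u_k
f = 0.9                  # reference:  r_{k+1} = f r_k
q, R, gamma = 10.0, 1.0, 0.95

def step(z, u):
    """One step of the augmented system z = [x, r]."""
    x, r = z
    return np.array([a * x + b * u, f * r])

def cost(z, u):
    """Stage cost of the augmented LQT problem."""
    x, r = z
    return q * (x - r) ** 2 + R * u ** 2

def phi(z, u):
    """Quadratic basis so that Q(z, u) = [z; u]^T H [z; u] = phi . theta."""
    w = np.array([z[0], z[1], u])
    return np.array([w[0] ** 2, 2 * w[0] * w[1], 2 * w[0] * w[2],
                     w[1] ** 2, 2 * w[1] * w[2], w[2] ** 2])

rng = np.random.default_rng(0)
K = np.zeros(2)                        # initial admissible policy u = K @ z
for _ in range(20):                    # policy iteration
    Phi, c = [], []
    for _ in range(200):               # excite the system, collect transitions
        z = rng.uniform(-2.0, 2.0, size=2)
        u = K @ z + 0.5 * rng.standard_normal()   # exploration noise
        z1 = step(z, u)
        # Bellman equation Q(z,u) = cost + gamma * Q(z', K z') rewritten as a
        # linear regression in theta, solved by least squares (model-free).
        Phi.append(phi(z, u) - gamma * phi(z1, K @ z1))
        c.append(cost(z, u))
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(c), rcond=None)
    H = np.array([[theta[0], theta[1], theta[2]],
                  [theta[1], theta[3], theta[4]],
                  [theta[2], theta[4], theta[5]]])
    K = -H[2, :2] / H[2, 2]            # greedy policy improvement

print("learned tracking gain K =", K)
```

Because the dynamics here are deterministic and the true Q-function is exactly quadratic, the least-squares fit recovers H for the current policy, and the improvement step converges to the gain that the model-based discounted Riccati equation would give, without ever using a, b, or f in the updates.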

APA

Liu, Y., & Yu, R. (2018). Model-free optimal tracking control for discrete-time system with delays using reinforcement Q-learning. Electronics Letters, 54(12), 750–752. https://doi.org/10.1049/el.2017.3238
