Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. To select the next query point and find the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As shown herein, however, GPs with a common kernel choice can lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.
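To make the first-order LQR setting concrete, the following minimal sketch evaluates the quadratic cost as a function of a scalar feedback gain, which is the kind of latent objective a GP would have to model in BO-based tuning. It is an illustration under stated assumptions, not the paper's formulation: the discrete-time model x_{t+1} = a x_t + b u_t, the control law u_t = -k x_t, the parameter values, and the function names `lqr_cost` and `optimal_gain` are all hypothetical choices made here.

```python
import numpy as np

# Assumed (hypothetical) setup: scalar discrete-time system
#   x_{t+1} = a * x_t + b * u_t,  with linear feedback u_t = -k * x_t,
# and cost J(k) = sum_t (q * x_t**2 + r * u_t**2).

def lqr_cost(k, a=0.9, b=1.0, q=1.0, r=1.0, x0=1.0):
    """Closed-form infinite-horizon cost of the gain k (inf if unstable)."""
    a_cl = a - b * k                      # closed-loop pole
    if abs(a_cl) >= 1.0:                  # geometric series diverges
        return np.inf
    # x_t = a_cl**t * x0, so the cost is a geometric series in a_cl**2
    return x0**2 * (q + r * k**2) / (1.0 - a_cl**2)

def optimal_gain(a=0.9, b=1.0, q=1.0, r=1.0, iters=500):
    """LQR gain from fixed-point iteration of the scalar Riccati equation."""
    p = q
    for _ in range(iters):
        p = q + a**2 * p - (a * b * p)**2 / (r + b**2 * p)
    return a * b * p / (r + b**2 * p)

# Brute-force search over stabilizing gains, as a stand-in for the
# objective that a controller-tuning method would explore.
gains = np.linspace(0.0, 1.8, 1801)
costs = np.array([lqr_cost(k) for k in gains])
k_grid = gains[np.argmin(costs)]
k_star = optimal_gain()

print(f"grid minimizer: {k_grid:.3f}, Riccati gain: {k_star:.3f}")
```

In this sketch the cost is a rational function of the gain that grows sharply as the closed-loop pole approaches the stability boundary. A generic kernel does not encode this kind of problem-specific structure, which is one plausible reading of why an LQR-informed kernel can help.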
Marco, A., Hennig, P., Schaal, S., & Trimpe, S. (2018). On the design of LQR kernels for efficient controller learning. In 2017 IEEE 56th Annual Conference on Decision and Control, CDC 2017 (Vol. 2018-January, pp. 5193–5200). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CDC.2017.8264429