Learning a Low-Dimensional Representation of a Safe Region for Safe Reinforcement Learning on Dynamical Systems

Zhehua Zhou; Ozgur S. Oguz; Marion Leibold; Martin Buss

Journal Article, Open Access

IEEE Transactions on Neural Networks and Learning Systems (2023) 34(5) 2513-2527

DOI: 10.1109/TNNLS.2021.3106818

16 Citations

25 Readers

Abstract

For the safe application of reinforcement learning algorithms to high-dimensional nonlinear dynamical systems, a simplified system model is used to formulate a safe reinforcement learning (SRL) framework. Based on the simplified system model, a low-dimensional representation of the safe region is identified and used to provide safety estimates for learning algorithms. However, finding a satisfactory simplified system model for complex dynamical systems usually requires considerable effort. To overcome this limitation, we propose a general data-driven approach that efficiently learns a low-dimensional representation of the safe region. By employing an online adaptation method, the low-dimensional representation is updated using feedback data to obtain more accurate safety estimates. The performance of the proposed approach for identifying the low-dimensional representation of the safe region is illustrated using the example of a quadcopter. The results demonstrate a more reliable and representative low-dimensional representation of the safe region compared with previous work, which extends the applicability of the SRL framework.
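As a rough illustration of the two ideas in the abstract (a safety estimate defined over a low-dimensional representation, and online adaptation from feedback data), the following minimal Python sketch may help. It is not the authors' method: the class name `LowDimSafetyEstimator`, the fixed linear projection, the grid discretization, and the exponential-averaging update are all assumptions chosen for brevity. The paper learns the representation from data, whereas here the projection is simply given.

```python
import numpy as np


class LowDimSafetyEstimator:
    """Toy safety estimate stored on a 2-D grid of low-dimensional features.

    Hypothetical illustration only: the projection is fixed and linear,
    and the grid/update rule are assumptions, not the paper's algorithm.
    """

    def __init__(self, projection, bins=10, low=-1.0, high=1.0,
                 prior=0.5, step=0.2):
        # projection: (2, n) matrix mapping an n-dim state to 2 features.
        self.projection = np.asarray(projection, dtype=float)
        self.bins, self.low, self.high, self.step = bins, low, high, step
        # Every grid cell starts at the same prior safety estimate.
        self.estimate = np.full((bins, bins), prior)

    def _cell(self, state):
        # Project the full state, then map each feature to a grid index.
        y = self.projection @ np.asarray(state, dtype=float)
        scaled = (y - self.low) / (self.high - self.low) * self.bins
        idx = np.clip(np.floor(scaled).astype(int), 0, self.bins - 1)
        return tuple(idx)

    def safety(self, state):
        """Return the current safety estimate in [0, 1] for a full state."""
        return float(self.estimate[self._cell(state)])

    def update(self, state, was_safe):
        """Online adaptation: move the visited cell toward the observed label."""
        cell = self._cell(state)
        target = 1.0 if was_safe else 0.0
        self.estimate[cell] += self.step * (target - self.estimate[cell])


# Usage: a 4-dim state reduced to its first two coordinates (assumed projection).
estimator = LowDimSafetyEstimator(np.eye(2, 4))
estimator.update([0.1, 0.1, 5.0, -3.0], was_safe=True)
print(estimator.safety([0.1, 0.1, 0.0, 0.0]))
```

Because the estimate lives in the low-dimensional space, states that differ only in the discarded coordinates share one estimate. That sharing is what makes such a representation cheap to learn, and it is also why its accuracy depends on how representative the reduction is.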

Citation (APA)

Zhou, Z., Oguz, O. S., Leibold, M., & Buss, M. (2023). Learning a Low-Dimensional Representation of a Safe Region for Safe Reinforcement Learning on Dynamical Systems. IEEE Transactions on Neural Networks and Learning Systems, 34(5), 2513–2527. https://doi.org/10.1109/TNNLS.2021.3106818
