Model-Based Safe Reinforcement Learning with Time-Varying Constraints: Applications to Intelligent Vehicles

Xinglong Zhang; Yaoqian Peng; Biao Luo; Wei Pan; Xin Xu; Haibin Xie

Journal Article, Open Access

IEEE Transactions on Industrial Electronics (2024) 71(10) 12744-12753

DOI: 10.1109/TIE.2023.3317853

21 Citations

11 Readers

Abstract

In recent years, safe reinforcement learning (RL) with the actor-critic structure has gained significant interest for continuous control tasks. However, achieving near-optimal control policies with safety and convergence guarantees remains challenging. Moreover, few works have focused on designing RL algorithms that handle time-varying safety constraints. This article proposes a safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints. The algorithm's novelty lies in two key aspects. First, the approach introduces a unique barrier force-based control policy structure to ensure control safety during learning. Second, a multistep policy evaluation mechanism is employed, enabling the prediction of policy safety risks under time-varying constraints and guiding safe updates. Theoretical results on learning convergence, stability, and robustness are proven. The proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment. It is also applied to the real-world problem of integrated path following and collision avoidance for two intelligent vehicles: a differential-drive vehicle and an Ackermann-drive vehicle. The experimental results demonstrate the sim-to-real transfer capability of the approach and satisfactory online control performance.

Citation (APA)

Zhang, X., Peng, Y., Luo, B., Pan, W., Xu, X., & Xie, H. (2024). Model-Based Safe Reinforcement Learning with Time-Varying Constraints: Applications to Intelligent Vehicles. IEEE Transactions on Industrial Electronics, 71(10), 12744–12753. https://doi.org/10.1109/TIE.2023.3317853
