Deep reinforcement learning (DRL) has attracted much attention as an approach to solving optimal control problems without a mathematical model of the system. In general, however, optimal control problems are subject to constraints. In this study, we consider optimal control problems with constraints for completing temporal control tasks. We describe the constraints using signal temporal logic (STL), which is well suited to time-sensitive control tasks because it can specify properties of continuous signals over bounded time intervals. To handle the STL constraints, we introduce an extension of the constrained Markov decision process (CMDP), called a τ-CMDP. We formulate the STL-constrained optimal control problem as a τ-CMDP and propose a two-phase constrained DRL algorithm based on the Lagrangian relaxation method. Through simulations, we also demonstrate the learning performance of the proposed algorithm.
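To give a sense of the Lagrangian relaxation technique the abstract refers to, the following is a minimal sketch on a toy constrained problem, not the paper's τ-CMDP algorithm: the constraint is folded into a Lagrangian, and primal ascent on the decision variable alternates with projected dual ascent on the multiplier. The objective, constraint, and step sizes here are illustrative assumptions.

```python
# Toy illustration of Lagrangian relaxation for a constrained problem
# (a sketch of the general technique, NOT the paper's tau-CMDP algorithm).
# Maximize R(a) = -(a - 2)^2 subject to C(a) = a <= d with d = 1.

def solve(steps=20000, lr_primal=0.01, lr_dual=0.01):
    a, lam, d = 0.0, 0.0, 1.0
    for _ in range(steps):
        # Primal ascent on the Lagrangian L(a, lam) = R(a) - lam * (C(a) - d)
        grad_a = -2.0 * (a - 2.0) - lam
        a += lr_primal * grad_a
        # Dual ascent on lam, projected onto lam >= 0; lam grows while the
        # constraint C(a) <= d is violated and shrinks once it is satisfied
        lam = max(0.0, lam + lr_dual * (a - d))
    return a, lam
```

Running `solve()` drives the primal variable to the constrained optimum a ≈ 1 with multiplier λ ≈ 2, rather than the unconstrained optimum a = 2. In constrained DRL, the same alternation is applied with a policy-gradient step in place of the primal update and the expected cumulative constraint cost in place of C(a).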
CITATION STYLE
Ikemoto, J., & Ushio, T. (2022). Deep Reinforcement Learning Under Signal Temporal Logic Constraints Using Lagrangian Relaxation. IEEE Access, 10, 114814–114828. https://doi.org/10.1109/ACCESS.2022.3218216