PlanLight: Learning to Optimize Traffic Signal Control with Planning and Iterative Policy Improvement

Huichu Zhang; Markos Kafouros; Yong Yu

Journal Article, Open Access

IEEE Access (2020) 8 219244-219255

DOI: 10.1109/ACCESS.2020.3041441

4 Citations

19 Readers

Abstract

Intelligent traffic signal control (TSC) is essential for transportation efficiency in modern road networks. There is an emerging trend of training TSC models with deep reinforcement learning in simulators, which reduces trial and error in real-world scenarios, and recent studies have shown promising results. The goal of TSC is to minimize the average travel time in a given area. However, it is impractical to optimize this goal directly by using the average travel time as the reward function, because the feedback is delayed and credit assignment is difficult. Existing methods therefore often define the reward function heuristically; because they optimize only the cumulative reward, the optimization may be biased with respect to the real target. In this work, we propose PlanLight, a novel planning-based TSC algorithm that learns, through behavior cloning, from demonstrations of a rollout algorithm, which obtains suboptimal control on the given target. We show the effectiveness and efficiency of the rollout algorithm in the multi-intersection control scenario. Moreover, we achieve further policy optimization by iteratively improving the base policy used in the rollout procedure. Through comprehensive experiments, we demonstrate that PlanLight outperforms both conventional transportation approaches and existing learning-based methods on traffic datasets of various sizes. Furthermore, we empirically show that PlanLight has the potential to serve as a general algorithm for improving future state-of-the-art TSC methods.
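
The three ideas named in the abstract (a rollout algorithm over a base policy, behavior cloning of its decisions, and iterating with the cloned policy as the new base) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the single-intersection queue model, the constants `ARRIVALS` and `SERVICE`, the waiting-vehicle cost used as a stand-in for travel time, the tabular cloning step, and all function names are hypothetical.

```python
from collections import Counter, defaultdict

# Toy model (hypothetical, not the paper's simulator): one intersection
# with two approaches; state = (queue_ns, queue_ew).
ARRIVALS = (1, 2)   # vehicles arriving per step on (NS, EW)
SERVICE = 3         # vehicles discharged per step on the green approach
ACTIONS = (0, 1)    # 0 = green for NS, 1 = green for EW

def step(state, action):
    """Advance one step; cost is the number of vehicles still waiting."""
    queues = [q + a for q, a in zip(state, ARRIVALS)]
    queues[action] = max(0, queues[action] - SERVICE)
    return tuple(queues), sum(queues)

def simulate(state, policy, horizon):
    """Total cost of following `policy` for `horizon` steps from `state`."""
    total = 0
    for _ in range(horizon):
        state, cost = step(state, policy(state))
        total += cost
    return total

def rollout_action(state, base_policy, horizon):
    """Try each action once, then follow the base policy; keep the cheapest."""
    def q_value(action):
        nxt, cost = step(state, action)
        return cost + simulate(nxt, base_policy, horizon)
    return min(ACTIONS, key=q_value)

def clone(demos):
    """Tabular behavior cloning: majority action for each observed state."""
    votes = defaultdict(Counter)
    for s, a in demos:
        votes[s][a] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    # Unseen states fall back to serving the longer queue (arbitrary choice).
    return lambda s: table.get(s, 0 if s[0] >= s[1] else 1)

def improve(base_policy, start, steps, horizon):
    """Collect rollout demonstrations, then clone them into a new policy."""
    demos, state = [], start
    for _ in range(steps):
        action = rollout_action(state, base_policy, horizon)
        demos.append((state, action))
        state, _ = step(state, action)
    return clone(demos)

def always_ns(state):
    """Deliberately poor base policy: NS always gets the green."""
    return 0

policy = always_ns
for _ in range(3):  # iterative policy improvement
    policy = improve(policy, start=(4, 4), steps=50, horizon=5)
```

Comparing `simulate((4, 4), policy, 50)` with `simulate((4, 4), always_ns, 50)` shows how the cloned policy fares against the initial base policy. A lookup table stands in for the learned model only to keep the sketch short; a learned TSC policy would more plausibly be a neural network trained on the same state-action pairs.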

Cite

Zhang, H., Kafouros, M., & Yu, Y. (2020). PlanLight: Learning to Optimize Traffic Signal Control with Planning and Iterative Policy Improvement. IEEE Access, 8, 219244–219255. https://doi.org/10.1109/ACCESS.2020.3041441
