Modular production control using deep reinforcement learning: proximal policy optimization


Abstract

EU regulations on CO2 limits and the trend towards individualization are pushing the automotive industry towards greater flexibility and robustness in production. One approach to address these challenges is modular production, in which workstations are decoupled by automated guided vehicles, requiring new control concepts. Modular production control aims at throughput-optimal coordination of products, workstations, and vehicles. For this NP-hard problem, conventional control approaches are computationally inefficient, fail to find optimal solutions, or do not generalize. In contrast, Deep Reinforcement Learning offers powerful and generalizable algorithms that can cope with varying environments and high complexity. One of these algorithms is Proximal Policy Optimization, which is used in this article to address modular production control. Experiments in several modular production control settings demonstrate stable, reliable, optimal, and generalizable learning behavior. The agent successfully adapts its strategies to the given problem configuration. We explain how this learning behavior is achieved, focusing in particular on the design of the agent's actions, states, and rewards.
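The abstract names Proximal Policy Optimization as the learning algorithm. The article itself details the agent's action, state, and reward design; as a generic illustration of the algorithm's core idea (not the authors' implementation), the clipped surrogate objective from the standard PPO formulation can be sketched as follows, with the clipping parameter `eps` as an assumed default:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate objective of Proximal Policy Optimization.

    ratio:     pi_new(a|s) / pi_old(a|s), the probability ratio per sample
    advantage: estimated advantage A(s, a) per sample
    eps:       clipping range (0.2 is a common default, assumed here)
    """
    unclipped = ratio * advantage
    # Clipping the ratio removes the incentive to move the new policy
    # far from the old one in a single update.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Elementwise minimum gives a pessimistic (lower) bound on improvement.
    return np.minimum(unclipped, clipped).mean()

# Example: a positive advantage with ratio 1.5 is capped at 1 + eps = 1.2.
obj = ppo_clip_objective(np.array([1.5]), np.array([1.0]))
```

In a full training loop this objective is maximized by gradient ascent on the policy parameters, alongside a value-function loss and an entropy bonus; the sketch above shows only the scalar objective.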

Citation (APA)
Mayer, S., Classen, T., & Endisch, C. (2021). Modular production control using deep reinforcement learning: proximal policy optimization. Journal of Intelligent Manufacturing, 32(8), 2335–2351. https://doi.org/10.1007/s10845-021-01778-z
