Modular production control using deep reinforcement learning: proximal policy optimization


Abstract

EU regulations on CO2 limits and the trend towards individualization are pushing the automotive industry towards greater flexibility and robustness in production. One approach to address these challenges is modular production, in which workstations are decoupled by automated guided vehicles, requiring new control concepts. Modular production control aims at throughput-optimal coordination of products, workstations, and vehicles. For this NP-hard problem, conventional control approaches are computationally inefficient, fail to find optimal solutions, or do not generalize. In contrast, Deep Reinforcement Learning offers powerful and generalizable algorithms that can cope with varying environments and high complexity. One of these algorithms is Proximal Policy Optimization, which is used in this article to address modular production control. Experiments in several modular production control settings demonstrate stable, reliable, optimal, and generalizable learning behavior. The agent successfully adapts its strategies to the given problem configuration. We explain how this learning behavior is achieved, focusing in particular on the design of the agent's actions, states, and rewards.
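The abstract names Proximal Policy Optimization as the learning algorithm. The article itself details the agent's action, state, and reward design; as a generic illustration of the algorithm's core idea (not the authors' implementation), the clipped surrogate objective from the standard PPO formulation can be sketched as follows, with the clipping parameter `eps` as an assumed default:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate objective of Proximal Policy Optimization.

    ratio:     pi_new(a|s) / pi_old(a|s), the probability ratio per sample
    advantage: estimated advantage A(s, a) per sample
    eps:       clipping range (0.2 is a common default, assumed here)
    """
    unclipped = ratio * advantage
    # Clipping the ratio removes the incentive to move the new policy
    # far from the old one in a single update.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Elementwise minimum gives a pessimistic (lower) bound on improvement.
    return np.minimum(unclipped, clipped).mean()

# Example: a positive advantage with ratio 1.5 is capped at 1 + eps = 1.2.
obj = ppo_clip_objective(np.array([1.5]), np.array([1.0]))
```

In a full training loop this objective is maximized by gradient ascent on the policy parameters, alongside a value-function loss and an entropy bonus; the sketch above shows only the scalar objective.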

Citation (APA)
Mayer, S., Classen, T., & Endisch, C. (2021). Modular production control using deep reinforcement learning: proximal policy optimization. Journal of Intelligent Manufacturing, 32(8), 2335–2351. https://doi.org/10.1007/s10845-021-01778-z
