Using process data to generate an optimal control policy via apprenticeship and reinforcement learning


Abstract

Reinforcement learning (RL) is a data-driven approach to synthesizing an optimal control policy. A barrier to the wide implementation of RL-based controllers is their data-hungry nature during online training and their inability to extract useful information from human operator and historical process operation data. Here, we present a two-step framework to resolve this challenge. First, we employ apprenticeship learning via inverse RL to analyze historical process data, simultaneously identifying a reward function and a parameterization of the control policy. This step is conducted offline. Second, the parameterization is efficiently improved online via RL on the operating process, within only a few iterations. Significant advantages of this framework include hot-starting RL algorithms for process optimal control and robustly abstracting existing controllers and control knowledge from data. The framework is demonstrated on three case studies, showing its potential for chemical process control.
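The two-step framework in the abstract can be illustrated with a minimal sketch on a hypothetical toy process. Everything below is an assumption for illustration: a first-order process model, a known quadratic reward (which the paper instead identifies via inverse RL), least-squares imitation of historical operator data as a stand-in for apprenticeship learning, and finite-difference policy search as a stand-in for the online RL step.

```python
# Minimal two-step sketch: offline warm-start from historical data,
# then a few online RL-style refinement iterations. Toy example only.
import numpy as np

rng = np.random.default_rng(0)

def step(x, u):
    """Hypothetical first-order process dynamics."""
    return 0.9 * x + 0.5 * u

def reward(x, u):
    """Assumed quadratic reward; the paper identifies this via inverse RL."""
    return -(x**2 + 0.1 * u**2)

def rollout(theta, x0=1.0, horizon=20):
    """Total reward of the linear policy u = theta * x over one episode."""
    x, total = x0, 0.0
    for _ in range(horizon):
        u = theta * x
        total += reward(x, u)
        x = step(x, u)
    return total

# Step 1 (offline): fit the policy parameterization to historical operator
# data by least squares -- a simple stand-in for apprenticeship learning.
hist_x = rng.normal(size=200)
hist_u = -0.8 * hist_x + 0.05 * rng.normal(size=200)  # demonstrations
theta = float(np.dot(hist_x, hist_u) / np.dot(hist_x, hist_x))  # ~ -0.8

# Step 2 (online): refine the warm-started policy with a few iterations of
# finite-difference policy search, standing in for the online RL step.
eps, lr = 0.05, 0.02
for _ in range(10):
    grad = (rollout(theta + eps) - rollout(theta - eps)) / (2 * eps)
    theta += lr * grad

print(theta)
```

Because the policy starts from a parameterization already close to the operators' behavior, only a handful of online iterations are needed, which is the hot-start advantage the abstract describes.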


Citation (APA)

Mowbray, M., Smith, R., Del Rio-Chanona, E. A., & Zhang, D. (2021). Using process data to generate an optimal control policy via apprenticeship and reinforcement learning. AIChE Journal, 67(9). https://doi.org/10.1002/aic.17306
