Policy search for path integral control

Vicenç Gómez; Hilbert J. Kappen; Jan Peters; Gerhard Neumann

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8724 LNAI (Part 1), pp. 482–497 (2014)

DOI: 10.1007/978-3-662-44848-9_31

Abstract

Path integral (PI) control defines a general class of control problems for which the optimal control computation is equivalent to an inference problem that can be solved by evaluation of a path integral over state trajectories. However, this potential is mostly unused in real-world problems because of two main limitations: first, current approaches can typically only be applied to learn open-loop controllers, and second, current sampling procedures are inefficient and do not scale to high-dimensional systems. We introduce the efficient Path Integral Relative-Entropy Policy Search (PI-REPS) algorithm for learning feedback policies with PI control. Our algorithm is inspired by information-theoretic policy updates that are often used in policy search. We use these updates to approximate the state trajectory distribution that is known to be optimal from PI control theory. Our approach allows for a principled treatment of different sampling distributions and can be used to estimate many types of parametric or non-parametric feedback controllers. We show that PI-REPS significantly outperforms current methods and is able to solve tasks that are out of reach for them. © 2014 Springer-Verlag.
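
To make the inference view in the abstract concrete, the following is a minimal sketch of the standard soft-min weighting used in path integral control, where each sampled trajectory with cost S_i receives a weight proportional to exp(-S_i / temperature). It is not the PI-REPS algorithm: it does not fit a feedback policy and does not perform the information-theoretic update the abstract describes. The function names, the example costs, and the fixed temperature are all assumptions made for illustration.

```python
import math


def path_integral_weights(costs, temperature):
    """Return normalized weights proportional to exp(-cost / temperature).

    Hypothetical helper for illustration; `costs` holds one total
    trajectory cost per sampled rollout.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the minimum cost before exponentiating for numerical stability;
    # the shift cancels out after normalization.
    lowest = min(costs)
    unnormalized = [math.exp(-(c - lowest) / temperature) for c in costs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def effective_sample_size(weights):
    """Effective number of samples, 1 / sum(w_i^2), for normalized weights."""
    return 1.0 / sum(w * w for w in weights)


# Made-up trajectory costs, purely for demonstration.
costs = [4.0, 2.5, 9.0, 2.7]
weights = path_integral_weights(costs, temperature=1.0)
ess = effective_sample_size(weights)
```

A low effective sample size relative to the number of rollouts indicates that a few trajectories dominate the estimate, which is one way to see the sampling inefficiency that the abstract names as a limitation of current approaches.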

Cite (APA)

Gómez, V., Kappen, H. J., Peters, J., & Neumann, G. (2014). Policy search for path integral control. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8724 LNAI, pp. 482–497). Springer Verlag. https://doi.org/10.1007/978-3-662-44848-9_31
