Learning replanning policies with direct policy search

Florian Brandherm; Jan Peters; Gerhard Neumann; Riad Akrour

Journal Article, Open Access

IEEE Robotics and Automation Letters (2019) 4(2) 2196-2203

DOI: 10.1109/LRA.2019.2901656

2 Citations

16 Readers

Abstract

Direct policy search has been successful in learning challenging real-world robotic motor skills by learning open-loop movement primitives with high sample efficiency. These primitives can be generalized to different contexts with varying initial configurations and goals. However, current state-of-the-art contextual policy search algorithms cannot adapt to changing, noisy context measurements, even though these are common characteristics of real-world robotic tasks. Planning a trajectory ahead of time based on an inaccurate context that may change during the motion often results in poor accuracy, especially in highly dynamic tasks. To adapt to updated contexts, it is sensible to learn trajectory replanning strategies. We propose a framework to learn trajectory replanning policies via contextual policy search and demonstrate that they are safe for the robot, can be learned efficiently, and outperform non-replanning policies for problems with partially observable or perturbed context.

Citation (APA)

Brandherm, F., Peters, J., Neumann, G., & Akrour, R. (2019). Learning replanning policies with direct policy search. IEEE Robotics and Automation Letters, 4(2), 2196–2203. https://doi.org/10.1109/LRA.2019.2901656
