AdaPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems

James Harrison; Animesh Garg; Boris Ivanovic; Yuke Zhu; Silvio Savarese; Li Fei-Fei; Marco Pavone

Book Chapter

Springer Science and Business Media B.V., (2020), 437-453

DOI: 10.1007/978-3-030-28619-4_34

Abstract

Model-free policy learning has enabled good performance on complex tasks that were previously intractable with traditional control techniques. However, this comes at the cost of requiring a perfectly accurate model for training, because the very high sample complexity of model-free methods prevents training on the target system. Such a model is infeasible to obtain in practice, which renders these methods unsuitable for physical systems. Model mismatch due to dynamics parameter differences and unmodeled dynamics error may cause suboptimal or unsafe behavior upon direct transfer. We introduce the Adaptive Policy Transfer for Stochastic Dynamics (AdaPT) algorithm, which achieves provably safe, robust, and dynamically feasible zero-shot transfer of RL policies to new domains with dynamics error. AdaPT combines the strengths of offline policy learning in a black-box source simulator with online tube-based MPC to attenuate bounded dynamics mismatch between the source and target dynamics. AdaPT allows online transfer of policies, trained offline solely in simulation, to a family of unknown targets without fine-tuning. We also formally show that (i) AdaPT guarantees bounded state and control deviation through state-action tubes under relatively weak technical assumptions, and (ii) AdaPT results in a bounded loss of reward accumulation relative to a policy trained and evaluated in the source environment. We evaluate AdaPT on two continuous, non-holonomic simulated dynamical systems with four different disturbance models, and find that AdaPT performs between 50 and better on mean reward accrual than direct policy transfer.
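The tube idea in the abstract (a nominal trajectory from the source model, plus online feedback that keeps the target state within a bounded deviation) can be illustrated with a minimal sketch. This is not the AdaPT algorithm: it assumes a scalar linear system, substitutes a fixed linear feedback gain for the online tube-based MPC, and uses a hand-written linear law in place of a learned policy. All names and parameter values (`rollout`, `tube_radius`, `a`, `b`, `gain`, `w_max`) are hypothetical.

```python
import random


def tube_radius(a=1.1, b=1.0, gain=-0.6, w_max=0.05):
    """Bound on |x - x_nom| for the error dynamics e' = (a + b*gain) * e + w."""
    rho = abs(a + b * gain)
    if rho >= 1.0:
        raise ValueError("feedback gain does not contract the deviation")
    # Geometric series bound: w_max * (1 + rho + rho^2 + ...) = w_max / (1 - rho)
    return w_max / (1.0 - rho)


def rollout(steps=200, a=1.1, b=1.0, gain=-0.6, w_max=0.05, seed=0):
    """Return the largest deviation between target and nominal states."""
    rng = random.Random(seed)
    x_nom = x = 1.0  # nominal (source) and actual (target) states start equal
    max_dev = 0.0
    for _ in range(steps):
        u_nom = -0.5 * x_nom                # stand-in for the learned policy (assumption)
        u = u_nom + gain * (x - x_nom)      # ancillary feedback on the deviation
        w = rng.uniform(-w_max, w_max)      # bounded dynamics mismatch on the target
        x_nom = a * x_nom + b * u_nom       # source model: no disturbance
        x = a * x + b * u + w               # target system: disturbed
        max_dev = max(max_dev, abs(x - x_nom))
    return max_dev
```

With the default values the deviation contracts by a factor of about 0.5 per step, so the tube radius is about 0.1 and `rollout()` stays inside it. Calling `rollout(gain=0.0)` corresponds to direct policy transfer in this toy setting: the open-loop deviation is multiplied by 1.1 each step and leaves the tube.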

Cite

Harrison, J., Garg, A., Ivanovic, B., Zhu, Y., Savarese, S., Fei-Fei, L., & Pavone, M. (2020). AdaPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems. In Springer Proceedings in Advanced Robotics (Vol. 10, pp. 437–453). Springer Science and Business Media B.V. https://doi.org/10.1007/978-3-030-28619-4_34
