Learning complex behaviors via sequential composition and passivity-based control

Abstract

The model-free paradigm of reinforcement learning (RL) is a theoretical strength. In practice, however, the stringent assumptions required for optimal solutions (full exploration of the state space) and experimental issues such as slow learning rates turn it into a practical weakness. This paper addresses practical implementations of RL by combining elements of systems and control with robotics. In our approach, space is handled by sequential composition (a technique commonly used in robotics) and time is handled by passivity-based control methods (a standard nonlinear control approach), with the aim of speeding up learning and providing a stopping-time criterion. Sequential composition in effect partitions the state space and allows controllers to be composed, each with its own domain of attraction (DoA) and goal set. As a result, learning takes place in subsets of the state space. Passivity-based control (PBC) is a model-based control approach in which the total energy is computable. This total energy can be used as a candidate Lyapunov function to evaluate the stability of a controller and to estimate its DoA. This enables learning in finite time: during learning, the candidate Lyapunov function is monitored online to approximate the DoA of the learned controller. Once this DoA covers the states that are relevant from the point of view of sequential composition, the learning process is stopped. The result is a collection of learned controllers that cover a desired range of the state space and can be composed in sequence to achieve various goals. Optimality is lost in favour of practicality. Other implications include safety while learning and incremental learning.
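The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a one-dimensional state, interval-shaped DoA estimates, and a simple quadratic function standing in for the total energy; all names (`Controller`, `estimate_doa_level`, `learning_can_stop`, `plan`) and the toy dynamics are hypothetical. The sketch estimates a DoA as a sublevel set of a candidate Lyapunov function, checks a stopping condition against a set of relevant states, and then searches for a sequence of controllers in which each goal lies inside the next controller's DoA.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Controller:
    """Hypothetical controller summary: an interval DoA estimate and a goal state."""
    name: str
    doa: tuple[float, float]
    goal: float

    def covers(self, x: float) -> bool:
        lo, hi = self.doa
        return lo <= x <= hi


def estimate_doa_level(V, V_dot, samples):
    """Return the largest sampled level c such that V_dot < 0 on every
    sample with 0 < V(x) <= c. The DoA estimate is then {x : V(x) <= c}."""
    level = 0.0
    for x in sorted(samples, key=V):
        if V(x) == 0.0:
            continue  # skip the equilibrium itself
        if V_dot(x) >= 0.0:
            break  # first sample where the candidate stops decreasing
        level = V(x)
    return level


def learning_can_stop(V, level, relevant_states):
    """Stopping check: every relevant state lies inside the estimated DoA."""
    return all(V(x) <= level for x in relevant_states)


def plan(controllers, x0, target):
    """Breadth-first search for a controller sequence from x0 to the target goal.
    Controller A can hand over to B when A's goal lies inside B's DoA."""
    queue = deque([c] for c in controllers if c.covers(x0))
    visited = set()
    while queue:
        path = queue.popleft()
        last = path[-1]
        if last.goal == target:
            return [c.name for c in path]
        if last.name in visited:
            continue
        visited.add(last.name)
        for nxt in controllers:
            if nxt.name not in visited and nxt.covers(last.goal):
                queue.append(path + [nxt])
    return None


# Toy closed loop x' = -x + x^3 (assumed for illustration), with V = x^2 / 2.
V = lambda x: 0.5 * x * x
V_dot = lambda x: x * (-x + x ** 3)
samples = [i / 10 for i in range(-20, 21)]
level = estimate_doa_level(V, V_dot, samples)

# Three hypothetical learned controllers with overlapping interval DoAs.
controllers = [
    Controller("A", (0.0, 4.0), 3.0),
    Controller("B", (2.5, 7.0), 6.0),
    Controller("C", (5.0, 10.0), 9.0),
]
sequence = plan(controllers, x0=1.0, target=9.0)
```

In this toy setting, `learning_can_stop(V, level, [-0.5, 0.5])` plays the role of the stopping-time criterion, and `sequence` is the chain of controllers that would be executed one after another. A sampled estimate like this is only as reliable as the sampling grid, so a real system would need a more careful certificate.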

Citation (APA)

Lopes, G. A. D., Najafi, E., Nageshrao, S. P., & Babuška, R. (2015). Learning complex behaviors via sequential composition and passivity-based control. In Studies in Systems, Decision and Control (Vol. 42, pp. 53–74). Springer International Publishing. https://doi.org/10.1007/978-3-319-26327-4_3
