Cooperative and competitive reinforcement and imitation learning for a mixture of heterogeneous learning modules

Abstract

This paper proposes Cooperative and competitive Reinforcement And Imitation Learning (CRAIL) for selecting an appropriate policy from a set of heterogeneous learning modules while training all of them in parallel. Each module has its own network architecture and improves its policy with an off-policy reinforcement learning algorithm and behavior cloning, using samples collected by a behavior policy that is constructed as a combination of all the module policies. Since the mixing weights are determined by each module's performance, a better policy is automatically selected according to learning progress. Experimental results on a benchmark control task show that CRAIL achieves fast learning by allowing modules with complicated network structures to exploit task-relevant samples for training.
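For illustration only, the following is a minimal Python sketch of a mixture behavior policy of the kind described above, assuming a softmax weighting over each module's recent return as the performance measure. The DummyModule class, the beta temperature, and all names below are hypothetical and are not taken from the paper.

import numpy as np

# Minimal sketch of a CRAIL-style behavior policy (illustrative only):
# a mixture over heterogeneous learning modules whose mixing weights are
# derived from each module's performance. The softmax weighting over
# recent returns is an assumption made for this example.

class DummyModule:
    """Stand-in for a learning module with its own policy network."""
    def __init__(self, name, recent_return):
        self.name = name
        self.recent_return = recent_return  # e.g., average return over recent episodes

    def act(self, state):
        # A real module would query its own policy network here.
        return np.tanh(np.sum(state)) + 0.1 * np.random.randn()


class MixtureBehaviorPolicy:
    """Behavior policy formed by mixing all module policies."""
    def __init__(self, modules, beta=1.0):
        self.modules = modules
        self.beta = beta  # inverse temperature of the softmax weighting

    def mixing_weights(self):
        # Better-performing modules receive larger mixing weights.
        r = np.array([m.recent_return for m in self.modules])
        z = np.exp(self.beta * (r - r.max()))
        return z / z.sum()

    def act(self, state):
        # Sample a module in proportion to its weight, then use its policy.
        w = self.mixing_weights()
        k = np.random.choice(len(self.modules), p=w)
        return self.modules[k].act(state)


if __name__ == "__main__":
    modules = [DummyModule("small_mlp", 10.0),
               DummyModule("deep_mlp", 25.0),
               DummyModule("recurrent", 5.0)]
    behavior = MixtureBehaviorPolicy(modules, beta=0.1)
    print("mixing weights:", behavior.mixing_weights())
    print("action:", behavior.act(np.zeros(4)))

Under such a scheme, the transitions gathered by the mixture policy would be shared with every module, so that modules with more complicated network structures can also train off-policy and via behavior cloning on the same task-relevant samples, which is the mechanism the abstract credits for fast learning.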

Citation (APA)

Uchibe, E. (2018). Cooperative and competitive reinforcement and imitation learning for a mixture of heterogeneous learning modules. Frontiers in Neurorobotics, 12(SEP). https://doi.org/10.3389/fnbot.2018.00061
