A method for learning macro-actions for virtual characters using programming by demonstration and reinforcement learning

Yunsick Sung; Kyungeun Cho

Journal Article, Open Access

Journal of Information Processing Systems (2012) 8(3) 409-420

DOI: 10.3745/JIPS.2012.8.3.409

Abstract

Decision-making by agents in games is commonly based on reinforcement learning. To improve the quality of agents, it is necessary to reduce the learning time and the size of the state space that learning requires. Macro-Actions, which are defined as sequences of primitive actions and executed as a unit, address these problems: they reduce the learning time by cutting down the number of policy decisions an agent must make. Macro-Actions were originally defined as combinations of the same primitive action. Following studies that generated Macro-Actions by learning, Macro-Actions are now thought to consist of diverse kinds of primitive actions. However, generating Macro-Actions by learning requires an enormous amount of learning time and a large state space. To resolve these issues, insights from studies on task learning through Programming by Demonstration (PbD) can be applied to generate Macro-Actions that reduce the learning time and state space. In this paper, we propose a method to define and execute Macro-Actions. Macro-Actions are learned from a human subject via PbD, and a policy is learned by reinforcement learning. In an experiment, the proposed method was applied to a car simulation to verify its scalability. Data were collected from a human subject's driving control, and the Macro-Actions required for driving a car were generated. Furthermore, the policy necessary for driving on a track was learned. Acquiring Macro-Actions by PbD reduced the driving time by about 16% compared to the case in which Macro-Actions were directly defined by a human subject. In addition, the learning time was reduced through faster convergence of the optimal policies. © 2012 KIPS.
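
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes that Macro-Actions can be approximated by the most frequent action subsequences in a demonstration, and that the policy is learned with a standard Q-learning update in which a Macro-Action executed for k steps is discounted by gamma to the power k. The action names, the toy one-dimensional track, and all function names (`extract_macro_actions`, `run_macro`, `toy_step`, `q_learning`) are hypothetical.

```python
import random
from collections import Counter

# Hypothetical primitive actions for a toy driving task (not from the paper).
PRIMITIVES = ("accelerate", "brake", "left", "right")
GOAL = 6  # assumed length of the toy track


def extract_macro_actions(demonstration, length=3, top_k=2):
    """Assumed PbD stand-in: keep the top_k most frequent contiguous
    action subsequences of the given length from a demonstration."""
    windows = [tuple(demonstration[i:i + length])
               for i in range(len(demonstration) - length + 1)]
    return [seq for seq, _ in Counter(windows).most_common(top_k)]


def toy_step(position, action):
    """Toy 1-D track: only 'accelerate' moves forward; every step costs -1."""
    if action == "accelerate":
        position = min(position + 1, GOAL)
    return position, -1.0, position == GOAL


def run_macro(env_step, state, macro, gamma):
    """Execute the primitives of one macro in order, stopping early if the
    episode ends. Returns (state, discounted reward, steps taken, done)."""
    total, discount, steps, done = 0.0, 1.0, 0, False
    for action in macro:
        state, reward, done = env_step(state, action)
        total += discount * reward
        discount *= gamma
        steps += 1
        if done:
            break
    return state, total, steps, done


def q_learning(options, episodes=200, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Epsilon-greedy Q-learning over options (tuples of primitives).
    One policy decision is made per option, not per primitive step."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                choice = rng.randrange(len(options))
            else:
                choice = max(range(len(options)),
                             key=lambda i: q.get((state, i), 0.0))
            nxt, reward, steps, done = run_macro(
                toy_step, state, options[choice], gamma)
            best_next = 0.0 if done else max(
                q.get((nxt, i), 0.0) for i in range(len(options)))
            # A k-step option bootstraps with gamma ** k.
            target = reward + gamma ** steps * best_next
            old = q.get((state, choice), 0.0)
            q[(state, choice)] = old + alpha * (target - old)
            state = nxt
    return q
```

To try it, build the option set from the primitives plus the extracted macros, for example `options = [(a,) for a in PRIMITIVES] + extract_macro_actions(demo)`, where `demo` is a recorded list of primitive action names, and pass it to `q_learning`. Treating each primitive as a one-step option keeps the update rule uniform for primitives and Macro-Actions.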

Citation (APA)

Sung, Y., & Cho, K. (2012). A method for learning macro-actions for virtual characters using programming by demonstration and reinforcement learning. Journal of Information Processing Systems, 8(3), 409–420. https://doi.org/10.3745/JIPS.2012.8.3.409
