In a reinforcement learning setting, the goal of transfer learning is to improve performance on a target task by re-using knowledge from one or more source tasks. A key problem in transfer learning is how to choose appropriate source tasks for a given target task. Current approaches typically require that the agent has some experience in the target domain, or that the target task is specified by a model (e.g., a Markov Decision Process) with known parameters. To address these limitations, this paper proposes a framework for selecting source tasks in the absence of a known model or target task samples. Instead, our approach uses meta-data (e.g., attribute-value pairs) associated with each task to learn the expected benefit of transfer given a source-target task pair. To test the method, we conducted a large-scale experiment in the Ms. Pac-Man domain in which an agent played over 170 million games spanning 192 variations of the task. The agent used vast amounts of experience about transfer learning in the domain to model the benefit (or detriment) of transferring knowledge from one task to another. Subsequently, the agent successfully selected appropriate source tasks for previously unseen target tasks.
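The selection idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which attributes, features, or learning model were used, so everything here is an assumption made for illustration. That includes the attribute names in ATTRIBUTES, the match/mismatch encoding in pair_features, and the nearest-neighbour predictor in predict_benefit. The sketch only shows the general shape of the approach: describe each source-target pair by its meta-data, predict the benefit of transfer from previously observed pairs, and pick the source with the highest predicted benefit.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical attribute names; the abstract does not list the real meta-data.
ATTRIBUTES: Tuple[str, ...] = ("maze", "ghosts")

Task = Dict[str, str]                       # attribute-value pairs for one task
Record = Tuple[Tuple[int, ...], float]      # (pair features, observed benefit)


def pair_features(source: Task, target: Task) -> Tuple[int, ...]:
    """Encode a source-target pair: 1 where the attribute values match, else 0."""
    return tuple(int(source.get(a) == target.get(a)) for a in ATTRIBUTES)


def predict_benefit(features: Sequence[int], history: List[Record], k: int = 1) -> float:
    """Average the observed benefit of the k most similar past pairs.

    Similarity is Hamming distance between feature vectors. This k-NN
    regressor is a stand-in for whatever model the paper actually learns.
    """
    if not history:
        raise ValueError("history must contain at least one observed pair")
    ranked = sorted(
        history,
        key=lambda rec: sum(a != b for a, b in zip(rec[0], features)),
    )
    nearest = ranked[:k]
    return sum(benefit for _, benefit in nearest) / len(nearest)


def select_source(target: Task, candidates: List[Task],
                  history: List[Record], k: int = 1) -> Task:
    """Return the candidate source task with the highest predicted benefit."""
    return max(
        candidates,
        key=lambda s: predict_benefit(pair_features(s, target), history, k),
    )
```

Note that select_source needs only the target task's meta-data, not samples or a model of the target task, which is the property the abstract emphasizes. The history records would come from past transfer experiments on other task pairs; negative benefit values would represent detrimental transfer.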