Emergent Policy Discovery for Visual Reinforcement Learning Through Tangled Program Graphs: A Tutorial

  • Kelly, S.
  • Smith, R. J.
  • Heywood, M. I.

Abstract

Tangled Program Graphs (TPG) is a framework in which multiple programs are organized to cooperate and decompose a task with minimal a priori information. TPG agents begin with minimal complexity and incrementally coevolve to a complexity befitting the nature of the task. Previous research has demonstrated that, on visual reinforcement learning tasks from the Arcade Learning Environment and the VizDoom first-person shooter, the TPG framework produces policies competitive with those from Deep Learning. However, unlike Deep Learning, the emergent constructive properties of TPG result in solutions that are orders of magnitude simpler, so execution never needs specialized hardware support. In this work, our goal is to provide a tutorial overview demonstrating how the emergent properties of TPG are achieved, as well as specific examples of the decompositions discovered under the VizDoom task.
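To make the "multiple programs organized to cooperate" idea concrete, the sketch below shows how a TPG policy graph is typically evaluated at decision time: each team holds bidding programs, the highest bid wins, and the winner's action is either an atomic action or a pointer to another team, which is evaluated recursively. The class names, the linear bid function, and the cycle guard are illustrative assumptions for this sketch; the actual framework evolves register-machine programs as bidders.

```python
class Program:
    """A bidder: scores the current state and proposes an action.

    `action` is either an int (atomic action) or a Team (graph edge).
    A linear bid is used here purely for illustration.
    """
    def __init__(self, weights, action):
        self.weights = weights
        self.action = action

    def bid(self, state):
        return sum(w * s for w, s in zip(self.weights, state))


class Team:
    """A node in the policy graph: a group of cooperating programs."""
    def __init__(self, programs):
        self.programs = programs

    def act(self, state, visited=None):
        visited = set() if visited is None else visited
        visited.add(id(self))
        # Highest-bidding program wins the right to act for this state.
        winner = max(self.programs, key=lambda p: p.bid(state))
        if isinstance(winner.action, Team):
            if id(winner.action) in visited:
                # Cycle guard: fall back to the best atomic bidder
                # (TPG teams retain at least one atomic-action program).
                atomic = [p for p in self.programs
                          if not isinstance(p.action, Team)]
                return max(atomic, key=lambda p: p.bid(state)).action
            return winner.action.act(state, visited)
        return winner.action


# Tiny two-team graph: the root either acts directly or defers to a leaf.
leaf = Team([Program([1.0, 1.0], 1)])
root = Team([Program([1.0, 0.0], leaf), Program([0.0, 1.0], 0)])
print(root.act([2.0, 1.0]))  # root defers to leaf, which returns action 1
print(root.act([0.0, 5.0]))  # root's atomic program wins, returning action 0
```

Note how complexity is emergent: a policy starts as a single team of bidders, and edges to other teams only appear as variation operators reuse existing teams during coevolution.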

Citation (APA)

Kelly, S., Smith, R. J., & Heywood, M. I. (2019). Emergent Policy Discovery for Visual Reinforcement Learning Through Tangled Program Graphs: A Tutorial (pp. 37–57). https://doi.org/10.1007/978-3-030-04735-1_3
