Value iteration networks

Aviv Tamar; Yi Wu; Garrett Thomas; Sergey Levine; Pieter Abbeel

Conference Proceedings, Open Access

IJCAI International Joint Conference on Artificial Intelligence (2017), Vol. 0, pp. 4949–4953

DOI: 10.24963/ijcai.2017/700

Abstract

We introduce the value iteration network (VIN): a fully differentiable neural network with a 'planning module' embedded within. VINs can learn to plan, and are suitable for predicting outcomes that involve planning-based reasoning, such as policies for reinforcement learning. Key to our approach is a novel differentiable approximation of the value-iteration algorithm, which can be represented as a convolutional neural network and trained end-to-end using standard backpropagation. We evaluate VIN-based policies on discrete and continuous path-planning domains, and on a natural-language-based search task. We show that by learning an explicit planning computation, VIN policies generalize better to new, unseen domains. This paper is a significantly abridged version, targeted at the IJCAI audience, of the original NIPS 2016 paper of the same title, available here: https://arxiv.org/abs/1602.02867.
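The claim that value iteration can be written as a convolutional network can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes a deterministic grid world with four moves, a reward received on entering a cell, and hand-set one-hot 3x3 kernels, whereas a VIN would learn its kernels by backpropagation. The names `KERNELS`, `conv3x3`, `value_iteration`, and the `off_grid` penalty are hypothetical, chosen only for this illustration. Each iteration applies one convolution per action to the reward and value maps, then takes a max over the action channel.

```python
import numpy as np

# One-hot 3x3 kernels, one per action. Each picks out the neighbour that a
# deterministic move would reach. Hand-set here (an assumption of this
# sketch); in a VIN these weights would be learned.
KERNELS = np.zeros((4, 3, 3))
KERNELS[0, 0, 1] = 1.0  # up
KERNELS[1, 2, 1] = 1.0  # down
KERNELS[2, 1, 0] = 1.0  # left
KERNELS[3, 1, 2] = 1.0  # right


def conv3x3(x, kernel, pad_value):
    """Same-size 3x3 cross-correlation of a 2-D map with constant padding."""
    h, w = x.shape
    padded = np.pad(x, 1, mode="constant", constant_values=pad_value)
    out = np.zeros((h, w))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def value_iteration(reward, goal, gamma=1.0, iterations=20, off_grid=-100.0):
    """Iterate V <- max_a [conv(reward, K_a) + gamma * conv(V, K_a)].

    `reward[i, j]` is the reward for entering cell (i, j); moves that leave
    the grid receive the `off_grid` penalty; `goal` is a terminal cell.
    """
    value = np.zeros_like(reward, dtype=float)
    for _ in range(iterations):
        # Q has one channel per action, shape (actions, H, W).
        q = np.stack([
            conv3x3(reward, k, off_grid) + gamma * conv3x3(value, k, 0.0)
            for k in KERNELS
        ])
        value = q.max(axis=0)  # max over the action channel
        value[goal] = 0.0      # the goal is terminal, so its value stays 0
    return value, q.argmax(axis=0)
```

Under these assumptions, calling `value_iteration(-np.ones((5, 5)), goal=(0, 0))` on an obstacle-free grid gives a value map equal to the negative Manhattan distance to the goal, and the returned greedy action indices (0 up, 1 down, 2 left, 3 right) point along a shortest path. The action reported for the goal cell itself is arbitrary.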

Citation (APA)

Tamar, A., Wu, Y., Thomas, G., Levine, S., & Abbeel, P. (2017). Value iteration networks. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 0, pp. 4949–4953). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2017/700
