Top-down Neural Attention by Excitation Backprop

Abstract

We aim to model the top-down attention of a Convolutional Neural Network (CNN) classifier for generating task-specific attention maps. Inspired by a top-down human visual attention model, we propose a new backpropagation scheme, called Excitation Backprop, to pass along top-down signals downwards in the network hierarchy via a probabilistic Winner-Take-All process. Furthermore, we introduce the concept of contrastive attention to make the top-down attention maps more discriminative. In experiments, we demonstrate the accuracy and generalizability of our method in weakly supervised localization tasks on the MS COCO, PASCAL VOC07 and ImageNet datasets. The usefulness of our method is further validated in the text-to-region association task. On the Flickr30k Entities dataset, we achieve promising performance in phrase localization by leveraging the top-down attention of a CNN model that has been trained on weakly labeled web images.
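
To make the probabilistic Winner-Take-All idea concrete, the sketch below shows how winning probabilities could be propagated one fully connected layer down, following the rule described in the abstract: a lower-layer neuron receives probability mass in proportion to its (non-negative) activation times the positive weight connecting it to each parent, normalized per parent. This is a minimal illustration, not the authors' released implementation; the function name, shapes, and the toy data are assumptions for demonstration.

```python
import numpy as np

def excitation_backprop_fc(p_out, a_in, W, eps=1e-12):
    """Propagate marginal winning probabilities one layer down.

    p_out : (n_out,) winning probabilities of the upper layer.
    a_in  : (n_in,)  non-negative (e.g. post-ReLU) activations of the lower layer.
    W     : (n_out, n_in) layer weights (forward pass: a_out = W @ a_in + b).
    Returns (n_in,) winning probabilities of the lower layer.
    """
    W_pos = np.maximum(W, 0.0)        # only excitatory (positive) connections compete
    z = W_pos @ a_in + eps            # per-parent normalization constants
    # P(a_j) = a_j * sum_i W_pos[i, j] * P(a_i) / z_i
    return a_in * (W_pos.T @ (p_out / z))

# Toy usage: start from a one-hot distribution over the class of interest and
# walk down layer by layer; the probabilities reaching an early feature layer
# form the task-specific attention map.
rng = np.random.default_rng(0)
a_in = np.abs(rng.normal(size=64))    # hypothetical post-ReLU activations
W = rng.normal(size=(10, 64))         # hypothetical classifier weights
p_top = np.zeros(10); p_top[3] = 1.0  # attend to class 3
p_in = excitation_backprop_fc(p_top, a_in, W)
print(p_in.sum())                     # mass is (approximately) conserved
```

Under this rule the total probability mass is preserved at each layer, so repeated application yields a normalized attention map. The contrastive attention mentioned in the abstract can be understood, roughly, as subtracting the map obtained with the sign-flipped classifier weights for the target class from the ordinary map, which suppresses regions shared with competing classes.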

Citation (APA)

Zhang, J., Lin, Z., Brandt, J., Shen, X., & Sclaroff, S. (2016). Top-down neural attention by excitation backprop. In Lecture Notes in Computer Science (Vol. 9908, pp. 543–559). Springer. https://doi.org/10.1007/978-3-319-46493-0_33
