Visual concept recognition and localization via iterative introspection

Abstract

Convolutional neural networks have been shown to develop internal representations that correspond closely to semantically meaningful objects and parts, despite being trained solely on class labels. Class Activation Mapping (CAM) is a recent method that makes it easy to highlight the image regions contributing to a network’s classification decision. We build upon these two developments to enable a network to re-examine informative image regions, a process we term introspection. We propose a weakly supervised iterative scheme that alternates stages of classification and introspection, shifting its center of attention to increasingly discriminative regions as it progresses. We evaluate our method on several datasets and obtain competitive or state-of-the-art results: on Stanford-40 Actions we set a new state of the art of 81.74%, and on FGVC-Aircraft and the Stanford Dogs dataset we show consistent improvements over baselines, some of which use significantly more supervision.
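
For readers unfamiliar with the building block, the sketch below (not taken from the paper; a minimal NumPy illustration) shows how a class activation map is typically formed from the last convolutional feature maps and the weights of the final classifier layer, and how a discriminative region might then be extracted to guide a subsequent round of attention. The 0.5 threshold and the bounding-box heuristic are assumptions for illustration, not the authors' exact procedure.

import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Class Activation Map (Zhou et al., CVPR 2016).

    feature_maps: (K, H, W) activations of the last convolutional layer
    fc_weights:   (num_classes, K) weights of the linear layer that
                  follows global average pooling
    class_idx:    index of the class to visualize
    """
    weights = fc_weights[class_idx]                    # (K,)
    cam = np.tensordot(weights, feature_maps, axes=1)  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0)                           # keep positive evidence only
    if cam.max() > 0:
        cam /= cam.max()                               # normalize to [0, 1]
    return cam

def most_discriminative_box(cam, threshold=0.5):
    """Bounding box of the region where the CAM exceeds `threshold`."""
    ys, xs = np.nonzero(cam >= threshold)
    if xs.size == 0:
        return None
    return xs.min(), ys.min(), xs.max(), ys.max()      # x0, y0, x1, y1

In an iterative scheme of this kind, the box would be mapped back to image coordinates, the corresponding crop re-classified, and the two steps alternated so that attention settles on increasingly discriminative regions.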

Citation (APA)

Rosenfeld, A., & Ullman, S. (2017). Visual concept recognition and localization via iterative introspection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10115 LNCS, pp. 264–279). Springer Verlag. https://doi.org/10.1007/978-3-319-54193-8_17
