E2Net: Excitative-Expansile Learning for Weakly Supervised Object Localization

Abstract

Weakly supervised object localization (WSOL), which seeks to train localizers with only image-level labels, has recently gained popularity. However, because they rely heavily on the classification objective for training, prevailing WSOL methods localize only the discriminative parts of an object, ignoring other useful information such as the wings of a bird, and suffer from severe rotation variations. Moreover, learning object localization under weak supervision forces CNNs to attend to non-salient regions, which may negatively influence image classification results. To address these challenges, this paper proposes a novel end-to-end Excitation-Expansion network, coined E$^2$Net, to localize entire objects with only image-level labels, a capability that serves as the basis of most multimedia tasks. The proposed E$^2$Net consists of two key components: Maxout-Attention Excitation (MAE) and Orientation-Sensitive Expansion (OSE). First, the MAE module aims to activate non-discriminative localization features while simultaneously recovering discriminative classification cues. To this end, we efficiently couple an erasing strategy with maxout learning to facilitate entire-object localization without hurting classification accuracy. Second, to address rotation variations, the proposed OSE module expands less salient object parts along all possible orientations. In particular, the OSE module dynamically combines selective attention banks from variously oriented expansions of the receptive field, which introduces additional multi-parallel localization heads. Extensive experiments on ILSVRC 2012 and CUB-200-2011 demonstrate that the proposed E$^2$Net outperforms previous state-of-the-art WSOL methods and also significantly improves classification performance.
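
The abstract does not spell out how the erasing strategy is coupled with maxout learning, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes a channel-averaged attention map, a fixed erasing threshold, and two 1x1 projections whose outputs are merged by an element-wise maximum; the function name `erase_and_maxout`, the weights `w_keep` and `w_erase`, and the `threshold` value are all hypothetical.

```python
import numpy as np

def erase_and_maxout(features, w_keep, w_erase, threshold=0.7):
    """Illustrative erasing + maxout step (assumed design, not from the paper).

    features: (C, H, W) array of non-negative feature maps.
    w_keep, w_erase: (C_out, C) weights of two hypothetical 1x1 projections.
    """
    # Attention map: channel-wise mean, min-max normalized to [0, 1].
    attention = features.mean(axis=0)
    span = attention.max() - attention.min()
    attention = (attention - attention.min()) / (span + 1e-8)

    # Erase the most strongly activated (most discriminative) locations.
    mask = attention > threshold            # (H, W) boolean mask
    erased = features * (~mask)             # broadcast over channels

    # Two parallel 1x1 projections: one on the original features,
    # one on the erased features.
    kept_branch = np.einsum("oc,chw->ohw", w_keep, features)
    erased_branch = np.einsum("oc,chw->ohw", w_erase, erased)

    # Maxout: element-wise maximum over the two branches.
    return np.maximum(kept_branch, erased_branch), mask
```

Under these assumptions, the erased branch can contribute responses from less salient regions, while the element-wise maximum lets the original branch retain the discriminative responses wherever they are stronger.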

Cite

Chen, Z., Cao, L., Shen, Y., Lian, F., Wu, Y., & Ji, R. (2021). E2Net: Excitative-Expansile Learning for Weakly Supervised Object Localization. In MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia (pp. 573–581). Association for Computing Machinery, Inc. https://doi.org/10.1145/3474085.3475211