Object-based aggregation of deep features for image retrieval

Abstract

In content-based visual image retrieval, image representation is one of the fundamental factors in retrieval performance. Recently, Convolutional Neural Network (CNN) features have proven highly successful as a universal representation. However, deep CNN features lack invariance to geometric transformations and object compositions, which limits their robustness for scene image retrieval. Because a scene image is always composed of multiple objects, which are crucial components for understanding and describing the scene, in this paper we propose an object-based aggregation method over CNN features to obtain an invariant and compact image representation for image retrieval. The proposed method represents an image through VLAD pooling of the CNN features that describe the underlying objects, which makes the representation robust to the spatial layout of objects in the scene and invariant to general geometric transformations. We evaluate the proposed method on three public ground-truth datasets against state-of-the-art approaches and achieve promising improvements.
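
To illustrate the pooling step mentioned in the abstract, here is a minimal sketch of generic VLAD aggregation in Python with NumPy. It is not the authors' implementation. It assumes that one CNN feature vector per detected object region is already available and that a codebook of visual words has been learned beforehand (for example with k-means). The function name `vlad_pool`, the array sizes, and the power normalization followed by L2 normalization are illustrative assumptions that the abstract does not specify.

```python
import numpy as np

def vlad_pool(descriptors, centers):
    """Aggregate descriptors (n, d) against a codebook (k, d) into one VLAD vector of length k*d."""
    # Assign each descriptor to its nearest codebook center (squared Euclidean distance).
    dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    assign = dists.argmin(axis=1)

    k, d = centers.shape
    vlad = np.zeros((k, d))
    for j in range(k):
        members = descriptors[assign == j]
        if len(members):
            # Sum the residuals between the assigned descriptors and their center.
            vlad[j] = (members - centers[j]).sum(axis=0)

    vlad = vlad.ravel()
    # Assumed post-processing: signed square-root, then L2 normalization.
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

# Hypothetical inputs: 5 object regions with 8-D features and a 3-word codebook.
rng = np.random.default_rng(0)
object_features = rng.normal(size=(5, 8))
codebook = rng.normal(size=(3, 8))

rep = vlad_pool(object_features, codebook)
# The same objects listed in a different order give the same representation.
shuffled = vlad_pool(object_features[::-1], codebook)
```

Because the residuals are summed per visual word, the result does not depend on the order in which the object features are supplied. This sketch shows only that order independence; it does not model the invariance to object layout and geometric transformations that the abstract reports.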

Cite

Bao, Y., & Li, H. (2017). Object-based aggregation of deep features for image retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10132 LNCS, pp. 478–489). Springer Verlag. https://doi.org/10.1007/978-3-319-51811-4_39
