In this paper we propose a generic framework for the optimization of image feature encoders for image retrieval. Our approach uses a triplet-based objective that compares, for a given query image, its similarity score with a matching image against its similarity score with a non-matching image, penalizing triplets that give the higher score to the non-matching image. We use stochastic gradient descent to solve the resulting problem and provide the required gradient expressions for generic encoder parameters, applying the resulting algorithm to learn the power normalization parameters commonly used to condition image features. We also propose a modification to codebook-based feature encoders in which each local descriptor is weighted as a function of its distance to its assigned codeword before aggregation during encoding. Using the VLAD feature encoder, we show experimentally that our proposed optimized power normalization method and local descriptor weighting method yield improvements on a standard dataset.
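The triplet objective and the learned power normalization described above can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on several assumptions the abstract does not state: the penalty is taken to be a hinge on the score difference, the similarity is a dot product of L2-normalized features, a single scalar exponent `alpha` is learned, and a central finite difference stands in for the analytic gradient expressions the paper derives. All function names are hypothetical.

```python
import math

def power_normalize(x, alpha):
    """Signed power normalization sign(v) * |v|**alpha, followed by L2 normalization."""
    y = [math.copysign(abs(v) ** alpha, v) for v in x]
    norm = math.sqrt(sum(v * v for v in y)) or 1.0  # avoid division by zero
    return [v / norm for v in y]

def similarity(a, b):
    """Dot-product similarity between two feature vectors (assumed score)."""
    return sum(u * v for u, v in zip(a, b))

def triplet_loss(query, match, non_match, alpha, margin=0.0):
    """Hinge penalty (assumed form): positive when the non-matching image
    scores at least as high as the matching image, up to the margin."""
    q = power_normalize(query, alpha)
    s_pos = similarity(q, power_normalize(match, alpha))
    s_neg = similarity(q, power_normalize(non_match, alpha))
    return max(0.0, margin + s_neg - s_pos)

def sgd_step(triplet, alpha, lr=0.1, eps=1e-4):
    """One stochastic gradient step on alpha for a single triplet.
    The gradient is approximated numerically here, for illustration only."""
    q, p, n = triplet
    grad = (triplet_loss(q, p, n, alpha + eps)
            - triplet_loss(q, p, n, alpha - eps)) / (2.0 * eps)
    return alpha - lr * grad
```

In use, one would draw a (query, matching, non-matching) triplet at random from the training set, call `sgd_step` to update `alpha`, and repeat; triplets that are already ranked correctly contribute zero loss and leave `alpha` unchanged.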
Rana, A., Zepeda, J., & Perez, P. (2015). Feature learning for the image retrieval task. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9010, pp. 152–165). Springer Verlag. https://doi.org/10.1007/978-3-319-16634-6_12