Incremental estimation of visual vocabulary size for image retrieval

Abstract

The growing number of image databases in recent years has highlighted the need to represent an image collection efficiently and quickly. The majority of image retrieval and image clustering approaches have been based on the construction of a visual vocabulary in the so-called Bag-of-Visual-Words (BoV) model, analogous to the Bag-of-Words (BoW) model for representing a collection of text documents. A visual vocabulary (codebook) is constructed by clustering all available visual features in an image collection using k-means or approximate k-means, which requires as input the number of visual words, i.e. the size of the visual vocabulary. This size is hard to tune or to estimate directly from the total number of visual descriptors. To avoid tuning or guessing the number of visual words, we propose an incremental estimation of the optimal visual vocabulary size based on the DBSCAN-Martingale, which was introduced in the context of text clustering and can estimate the number of clusters efficiently, even for very noisy datasets. For a sample of images, our method estimates the potential number of very dense SIFT patterns for each image in the collection. The proposed approach is evaluated on an image retrieval task and an image clustering task by means of Mean Average Precision and Normalized Mutual Information.
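
To make the role of the vocabulary size concrete, the following is a minimal sketch of the generic BoV pipeline the abstract describes: cluster all descriptors with k-means, then encode each image as a histogram over the resulting visual words. It is not the authors' implementation and does not include the DBSCAN-Martingale step. The function names `kmeans` and `bov_histogram`, the toy 2-D descriptors, and the deterministic farthest-point initialization are all assumptions made for illustration. Note that `k` has to be passed in by hand, which is exactly the input the paper proposes to estimate.

```python
from math import dist


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; k (the vocabulary size) must be supplied by the caller.

    Assumption: centers are initialized deterministically with a
    farthest-point rule so the toy example is reproducible.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point farthest from its nearest already-chosen center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist(p, centers[j]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members;
        # keep the old center if a cluster ends up empty.
        centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
    return centers


def bov_histogram(descriptors, vocabulary):
    """Count how many descriptors of one image fall on each visual word."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        word = min(range(len(vocabulary)), key=lambda j: dist(d, vocabulary[j]))
        hist[word] += 1
    return hist


# Toy 2-D "descriptors" for two images (real SIFT descriptors are 128-dimensional).
image_a = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1)]
image_b = [(5.1, 5.0), (4.9, 5.0), (0.0, 0.0)]

# The vocabulary is built from the descriptors of the whole collection.
vocabulary = kmeans(image_a + image_b, k=2)
hist_a = bov_histogram(image_a, vocabulary)  # [2, 1]
hist_b = bov_histogram(image_b, vocabulary)  # [1, 2]
```

With a poorly chosen `k`, distinct patterns are merged into one word or a single pattern is split across several, which is why a data-driven estimate of the vocabulary size is useful.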

Cite

Gialampoukidis, I., Vrochidis, S., & Kompatsiaris, I. (2017). Incremental estimation of visual vocabulary size for image retrieval. In Advances in Intelligent Systems and Computing (Vol. 529, pp. 29–38). Springer Verlag. https://doi.org/10.1007/978-3-319-47898-2_4