This paper presents a corpus of deep features extracted from the YFCC100M images, taken as the activations of the fc6 hidden layer of the HybridNet deep convolutional neural network. For a set of randomly selected queries, we make available the k-NN results obtained by sequentially scanning the entire feature set, using both the Euclidean distance on the original features and the Hamming distance on a binarized version of them. This set of results serves as ground truth for evaluating Content-Based Image Retrieval (CBIR) systems that use approximate similarity search methods for efficient and scalable indexing. Moreover, we present experimental results obtained by indexing this corpus with two distinct approaches: the Metric Inverted File and the Lucene Quantization. Both CBIR systems are publicly available online and allow real-time search using both internal and external queries.
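The ground-truth procedure described above (an exact k-NN sequential scan under the Euclidean distance and under the Hamming distance on binarized features) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy 4-dimensional vectors, and the binarization rule (a bit is set when a component exceeds a threshold of 0) are all assumptions made for the example; the abstract does not specify how the features are binarized. The `recall_at_k` helper shows one common way such ground truth might be used to score an approximate index.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two real-valued feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def binarize(v: Sequence[float], threshold: float = 0.0) -> int:
    """Pack a vector into an integer bit string.

    ASSUMPTION: bit i is set when v[i] > threshold. The actual
    binarization rule used for the corpus is not given in the abstract.
    """
    bits = 0
    for i, x in enumerate(v):
        if x > threshold:
            bits |= 1 << i
    return bits


def hamming(a: int, b: int) -> int:
    """Hamming distance: number of bit positions where two codes differ."""
    return bin(a ^ b).count("1")


def knn_scan(query: T, corpus: Sequence[T], k: int,
             dist: Callable[[T, T], float]) -> List[Tuple[float, int]]:
    """Exact k-NN by sequential scan; returns (distance, index) pairs, nearest first."""
    return heapq.nsmallest(
        k, ((dist(query, item), i) for i, item in enumerate(corpus))
    )


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the exact k nearest neighbours that an approximate result retrieved."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Toy data standing in for fc6 features (hypothetical values).
features = [
    [0.0, 1.2, 0.0, 3.1],
    [0.5, 0.0, 0.0, 2.9],
    [0.0, 0.9, 0.2, 3.0],
    [2.0, 0.0, 1.5, 0.0],
]
query = [0.0, 1.1, 0.0, 3.0]

# Ground truth under the Euclidean distance.
exact = knn_scan(query, features, 2, euclidean)

# Ground truth under the Hamming distance on binarized features.
codes = [binarize(f) for f in features]
binary = knn_scan(binarize(query), codes, 2, hamming)

exact_ids = [i for _, i in exact]
binary_ids = [i for _, i in binary]
print(exact_ids, binary_ids, recall_at_k(binary_ids, exact_ids))
```

A sequential scan costs one distance computation per corpus item, which is why it is suitable for producing ground truth offline but not for interactive search over a corpus of this size; approximate indexes trade some recall for speed, and the exact results are what that recall is measured against.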
Amato, G., Falchi, F., Gennaro, C., & Rabitti, F. (2016). YFCC100M HybridNet fc6 deep features for content-based image retrieval. In MMCommons 2016 - Proceedings of the 2016 ACM Workshop on the Multimedia COMMONS, co-located with ACM Multimedia 2016 (pp. 11–18). Association for Computing Machinery, Inc. https://doi.org/10.1145/2983554.2983557