In this paper, we present YFCC100M-HNfc6, a benchmark consisting of 97M deep features extracted from the Yahoo Flickr Creative Commons 100M (YFCC100M) dataset. Three types of features were extracted using a state-of-the-art Convolutional Neural Network trained on the ImageNet and Places datasets. Together with the features, we make publicly available a set of 1,000 queries and their k-NN results obtained by sequential scan. We first report detailed statistical information on both the features and the search results. Then, we show an example of a performance evaluation, carried out with this benchmark, of the MI-File approximate similarity access method.
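To make the role of the sequential-scan ground truth concrete, the following is a minimal sketch, not the authors' code. It assumes Euclidean distance between feature vectors, which the abstract does not state, and it uses recall against the exact result as one possible quality measure for an approximate method. The names `knn_sequential_scan` and `recall_at_k` are hypothetical and chosen only for illustration.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def knn_sequential_scan(
    query: Sequence[float],
    dataset: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[float, int]]:
    """Exact k-NN by comparing the query against every vector in the dataset.

    Returns (distance, index) pairs sorted by increasing distance.
    Assumption: Euclidean distance; the benchmark's actual metric may differ.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # One distance computation per dataset object: O(n) per query.
    scored = ((math.dist(query, vec), idx) for idx, vec in enumerate(dataset))
    # nsmallest keeps only the k best candidates while scanning.
    return heapq.nsmallest(k, scored)


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the exact k-NN that the approximate result also returned."""
    if not exact_ids:
        raise ValueError("exact result must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


if __name__ == "__main__":
    # Toy 2-D vectors standing in for the real deep features.
    data = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
    exact = knn_sequential_scan([0.9, 0.0], data, k=2)
    exact_ids = [idx for _, idx in exact]
    # A made-up approximate answer, used only to exercise recall_at_k.
    print(exact_ids, recall_at_k([1, 2], exact_ids))
```

Because the scan touches every object, its result is exact by construction, which is what makes it usable as a reference when scoring an approximate index.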
Amato, G., Falchi, F., Gennaro, C., & Rabitti, F. (2016). YFCC100M-HNFC6: A large-scale deep features benchmark for similarity search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9939 LNCS, pp. 196–209). Springer Verlag. https://doi.org/10.1007/978-3-319-46759-7_15