This work investigates the negative effects of hubness on multimedia retrieval systems. Hubness arises from the way distances behave in high-dimensional spaces: hub objects are close to an exceptionally large part of the data, while anti-hubs are far away from all other data points. In similarity-based retrieval, hub objects are retrieved over and over again, while anti-hubs never appear in the retrieval lists. We investigate textual, image, and music data and show how re-scaling methods can avoid the problem and decisively improve overall retrieval quality. These observations suggest making hubness analysis an integral part of building a retrieval system. © 2014 Springer International Publishing Switzerland.
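The hub and anti-hub effect described in the abstract can be illustrated with a small sketch. It counts how often each point appears in the k-nearest-neighbor lists of all other points (often called the k-occurrence) and reports the skewness of those counts, a measure commonly used in the hubness literature. The function names `k_occurrence` and `skewness`, the Euclidean distance, the synthetic Gaussian data, and the choice of k = 5 are all assumptions made for illustration; none of them are taken from the paper, and this is not the authors' implementation or their re-scaling method.

```python
import numpy as np


def k_occurrence(X, k=5):
    """Count how often each row of X appears among the k nearest
    neighbors of the other rows (Euclidean distance, assumed here)."""
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances via the expansion of ||a - b||^2.
    d = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]  # indices of the k nearest neighbors
    return np.bincount(nn.ravel(), minlength=X.shape[0])


def skewness(counts):
    """Third standardized moment of the k-occurrence distribution."""
    c = counts - counts.mean()
    return float(np.mean(c ** 3) / np.mean(c ** 2) ** 1.5)


# Hypothetical synthetic data: the same number of points in a low- and a
# high-dimensional space, to compare the k-occurrence distributions.
rng = np.random.default_rng(0)
for dim in (3, 100):
    X = rng.standard_normal((500, dim))
    n_k = k_occurrence(X, k=5)
    print(
        f"dim={dim}: max k-occurrence={n_k.max()}, "
        f"points never retrieved={(n_k == 0).sum()}, "
        f"skewness={skewness(n_k):.2f}"
    )
```

Points with a very large count play the role of hubs, and points with a count of zero correspond to anti-hubs that never show up in any retrieval list. A strongly right-skewed count distribution is the usual sign that hubness is present.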
Schnitzer, D., Flexer, A., & Tomašev, N. (2014). A case for hubness removal in high-dimensional multimedia retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8416 LNCS, pp. 687–692). Springer Verlag. https://doi.org/10.1007/978-3-319-06028-6_77