Image collections may contain multiple copies, versions, and fragments of the same image. Storage or retrieval of such duplicates and near-duplicates may be unnecessary and, in the context of collections derived from the web, their presence may represent infringements of copyright. However, identifying image versions is a challenging problem, as they can be subject to a wide range of digital alterations, and is potentially costly as the number of image pairs to be considered is quadratic in collection size. In this paper, we propose a method for finding the pairs of near-duplicates based on manipulation of an image index. Our approach is an adaptation of a robust object recognition technique and a near-duplicate document detection algorithm to this application domain. We show that this method requires only moderate computing resources, and is highly effective at identifying pairs of near-duplicates. © Springer-Verlag Berlin Heidelberg 2007.
CITATION STYLE
Foo, J. J., Sinha, R., & Zobel, J. (2007). Discovery of image versions in large collections. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4352 LNCS, pp. 433–442). https://doi.org/10.1007/978-3-540-69429-8_44
Mendeley helps you to discover research relevant for your work.