To become an effective complement to traditional Web-scale text-based image retrieval solutions, content-based image retrieval must address scalability and efficiency issues. In this paper we investigate the possibility of caching the answers to content-based image retrieval queries in metric space, with the aim of reducing the average cost of query processing and boosting overall system throughput. Our proposal exploits the similarity between the query object and the cache content, and allows the cache to return approximate answers with an acceptable quality guarantee even if the query being processed has never been encountered before. Moreover, since popular images that are likely to be used as queries have several near-duplicate versions, we show that our caching algorithm is robust and does not suffer from cache pollution problems caused by near-duplicate query objects. We report very promising results obtained with a collection of one million high-quality digital photos. We show that caching strategies are worth pursuing in similarity search systems as well, since the proposed caching techniques can have a significant impact on performance, just as caching of text queries has proven effective for traditional Web search engines. Copyright 2009 ACM.
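The idea of answering an unseen query from the results of a similar cached query can be illustrated with a minimal sketch. This is not the authors' algorithm: the class names, the `max_distance` threshold, the linear scan over cache entries, and the use of Euclidean distance are all assumptions made for illustration. The only property relied on is the triangle inequality of the metric. If a cached query `qc` holds its k nearest objects within radius `r`, then every object within `r - d(q, qc)` of a new query `q` must already be in that cached result, so those objects are exact nearest neighbours of `q`.

```python
import math
from dataclasses import dataclass, field


def distance(a, b):
    """Euclidean distance, used here as an example metric (assumption)."""
    return math.dist(a, b)


@dataclass
class CachedQuery:
    query: tuple
    results: list   # the k nearest objects to `query`
    radius: float   # distance from `query` to its farthest cached result


@dataclass
class SimilarityCache:
    max_distance: float                 # hypothetical quality threshold
    entries: list = field(default_factory=list)

    def put(self, query, results):
        """Store the k-NN results computed by the (expensive) index."""
        radius = max(distance(query, o) for o in results)
        self.entries.append(CachedQuery(query, list(results), radius))

    def get(self, query, k):
        """Return (approximate top-k, count guaranteed exact) or None on a miss."""
        if not self.entries:
            return None
        # Linear scan for the closest cached query (a real cache would index this).
        nearest = min(self.entries, key=lambda e: distance(query, e.query))
        d = distance(query, nearest.query)
        if d > self.max_distance:
            return None  # too dissimilar: treat as a miss
        # Re-rank the cached objects by their distance to the new query.
        top_k = sorted(nearest.results, key=lambda o: distance(query, o))[:k]
        # Objects inside the safe radius are true nearest neighbours of `query`.
        safe_radius = nearest.radius - d
        guaranteed = sum(1 for o in top_k if distance(query, o) <= safe_radius)
        return top_k, guaranteed
```

In this sketch a near-duplicate of a cached query lies very close to it, so it is served as a hit from the existing entry instead of requiring a new one, which is one plausible reading of how near-duplicates can avoid polluting such a cache.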
Falchi, F., Lucchese, C., Orlando, S., Perego, R., & Rabitti, F. (2009). Caching content-based queries for robust and efficient image retrieval. In Proceedings of the 12th International Conference on Extending Database Technology Advances in Database Technology - EDBT ’09 (p. 780). New York, New York, USA: ACM Press. https://doi.org/10.1145/1516360.1516450