This paper investigates image retrieval over large volumes of image data. We propose an improved Content Based Image Retrieval (CBIR) system built on Apache Spark, a fast cluster-computing engine for large-scale data processing, to address shortcomings in retrieval speed and accuracy. Specifically, images are represented by binary descriptors, built with the uniform sampling pattern of Binary Robust Invariant Scalable Keypoints (BRISK), in place of the floating-point descriptors of the original SURF; binary descriptors consume less memory and speed up retrieval. We then apply Random Sample Consensus (RANSAC) to the pre-matched point pairs to eliminate mismatches and further improve retrieval accuracy. Experimental results show that the proposed system significantly improves both retrieval speed and accuracy compared to traditional CBIR systems.
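The two matching steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it runs on a single machine without Spark, stores each binary descriptor as a plain Python integer, and uses a translation-only motion model as a simplified stand-in for whatever geometric model the paper fits with RANSAC. The function names (`hamming`, `pre_match`, `ransac_translation`) and the parameters `max_dist`, `tol`, `iters`, and `seed` are hypothetical and chosen only for illustration.

```python
import random


def hamming(a: int, b: int) -> int:
    """Count the differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def pre_match(query, database, max_dist=10):
    """Nearest-neighbour pre-matching by Hamming distance.

    query, database: lists of (descriptor_int, (x, y)) tuples.
    Returns a list of ((xq, yq), (xd, yd)) candidate point pairs.
    """
    if not database:
        return []
    pairs = []
    for desc_q, point_q in query:
        # Closest database descriptor for this query descriptor.
        best = min(database, key=lambda item: hamming(desc_q, item[0]))
        if hamming(desc_q, best[0]) <= max_dist:
            pairs.append((point_q, best[1]))
    return pairs


def ransac_translation(pairs, tol=2.0, iters=100, seed=0):
    """Keep the largest set of pairs that agree on a single translation.

    Assumption: a pure translation is used here only to keep the sketch
    short; it is a stand-in, not the model used in the paper.
    """
    if not pairs:
        return []
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iters):
        # One pair is enough to hypothesise a translation (dx, dy).
        (xq, yq), (xd, yd) = rng.choice(pairs)
        dx, dy = xd - xq, yd - yq
        inliers = [
            p for p in pairs
            if abs(p[1][0] - p[0][0] - dx) <= tol
            and abs(p[1][1] - p[0][1] - dy) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers
```

Comparing binary descriptors needs only an XOR and a bit count, which is the reason the abstract gives for their lower memory use and faster retrieval relative to floating-point descriptors. The RANSAC step then discards candidate pairs that are close in descriptor space but geometrically inconsistent with the majority.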
CITATION STYLE
Huang, T., Yu, Z., Lin, X., Jiang, L., & Zhao, D. (2017). A distributed CBIR system based on improved SURF on Apache Spark. In Lecture Notes in Electrical Engineering (Vol. 449, pp. 147–155). Springer Verlag. https://doi.org/10.1007/978-981-10-6451-7_18