Algorithm for processing k-nearest join based on R-tree in MapReduce

Abstract

To accelerate k-nearest neighbor join (knnJ) queries over large-scale spatial data, this paper presents a knnJ algorithm based on the R-tree in MapReduce. First, the MapReduce parallel programming model is abstracted using a formalization of independent parallelism and sequential synchronization (IPSS) computation. Based on this abstraction, the paper proposes efficient algorithms for bulk-building an R-tree and for performing the knnJ query on the constructed R-tree. For bulk building, a sampling algorithm quickly determines the spatial partition function, which makes the construction process conform to the IPSS model and easy to express in MapReduce. For the knnJ query, a knn expanded bounding box is introduced to limit the knn query range and to partition the data; the generated R-tree is then used to execute the knnJ query in parallel, achieving high performance. The paper analyzes the communication and computation costs theoretically. Experiments on large real-world spatial datasets show that the algorithm efficiently answers large-scale knnJ spatial queries in a MapReduce environment and is practical to apply. © 2013 ISCAS.
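
The role of the knn expanded bounding box can be illustrated with a small single-machine sketch. This is not the paper's algorithm: the abstract does not say how the box is computed, so the sketch assumes one plausible construction in which the bounding rectangle of a partition of R is grown by an upper bound on the k-th neighbor distance, with that bound estimated from a sample of S. A linear scan stands in for the R-tree, and the function names (`knn`, `expanded_box`, `knn_join_partition`) are hypothetical.

```python
import heapq
import math

def knn(point, candidates, k):
    """Return the k candidates closest to point (Euclidean), nearest first."""
    return heapq.nsmallest(k, candidates, key=lambda s: math.dist(point, s))

def expanded_box(partition, sample, k):
    """Bounding box of a partition of R, grown by a safe search radius.

    Assumption: sample is a subset of S with at least k points. The k-th
    neighbor distance measured inside a subset can only overestimate the
    true k-th neighbor distance in all of S, so it is a valid upper bound.
    """
    radius = max(math.dist(r, knn(r, sample, k)[-1]) for r in partition)
    xs = [p[0] for p in partition]
    ys = [p[1] for p in partition]
    return (min(xs) - radius, min(ys) - radius,
            max(xs) + radius, max(ys) + radius)

def knn_join_partition(partition, S, sample, k):
    """knn join for one partition of R, scanning only S points in the box."""
    x0, y0, x1, y1 = expanded_box(partition, sample, k)
    candidates = [s for s in S if x0 <= s[0] <= x1 and y0 <= s[1] <= y1]
    # A linear scan replaces the R-tree lookup used in the paper.
    return {r: knn(r, candidates, k) for r in partition}

# Hypothetical 2D points, used only to exercise the sketch.
R_part = [(0.0, 0.0), (1.0, 1.0)]
S = [(0.1, 0.2), (0.9, 1.3), (2.0, 2.1), (5.0, 5.5),
     (-3.0, 4.0), (1.5, 0.4), (10.0, 10.0)]
result = knn_join_partition(R_part, S, sample=S[:3], k=2)
```

Because each partition needs only the points of S that fall inside its own expanded box, partitions can be processed independently of one another, which is the property that lets the join be distributed across parallel tasks.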

Liu, Y., Jing, N., Chen, L., & Xiong, W. (2013). Algorithm for processing k-nearest join based on R-tree in MapReduce. Ruan Jian Xue Bao/Journal of Software, 24(8), 1836–1851. https://doi.org/10.3724/SP.J.1001.2013.04377
