Algorithm for processing k-nearest join based on R-tree in MapReduce

Yi Liu; Ning Jing; Luo Chen; Wei Xiong

Journal Article, Open Access

Ruan Jian Xue Bao/Journal of Software (2013) 24(8) 1836-1851

DOI: 10.3724/SP.J.1001.2013.04377

Abstract

To accelerate the k-nearest neighbor join (knnJ) query over large-scale spatial data, this paper presents an R-tree-based knnJ algorithm in MapReduce. First, the MapReduce parallel programming model is abstracted using a formalization of independent parallelism and sequential synchronization (IPSS) computation. Next, based on this abstraction, the paper proposes efficient algorithms for bulk-building an R-tree and for performing the knnJ query on the constructed R-tree. During bulk building, a sampling algorithm rapidly determines the spatial partition function, which makes the construction process conform to the IPSS model and easy to express in MapReduce. During the knnJ query, a knn expanded bounding box is introduced to limit the knn query range and to partition the data; the generated R-tree is then used to execute the knnJ query in parallel, achieving high performance. The communication and computation costs are analyzed theoretically. Experimental results on large real-world spatial datasets demonstrate that the algorithm efficiently answers large-scale knnJ spatial queries in a MapReduce environment and is suitable for practical application. © 2013 ISCAS.
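
The abstract does not define the knn expanded bounding box, so the following is only a minimal single-machine sketch of one plausible reading, not the authors' algorithm. It assumes 2D points, Euclidean distance, and at least k data points. All function names are hypothetical, and a linear scan stands in for the R-tree lookup. The idea: if k data points are known to lie within distance r of every location in a query partition's bounding box, then each query's true k nearest neighbors must fall inside that box grown by r on every side, so only those data points need to be sent to the partition.

```python
import heapq
import math

def bounding_box(points):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of 2D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

def max_dist_box_to_point(box, p):
    """Largest distance from any location inside box to point p."""
    xmin, ymin, xmax, ymax = box
    dx = max(abs(p[0] - xmin), abs(p[0] - xmax))
    dy = max(abs(p[1] - ymin), abs(p[1] - ymax))
    return math.hypot(dx, dy)

def knn_expanded_box(query_box, data, k):
    """Grow query_box by r, an upper bound on every query's k-th neighbor distance.

    r is the k-th smallest "farthest distance from the box" over the data,
    so at least k data points lie within r of any location in the box.
    Assumption: len(data) >= k.
    """
    r = heapq.nsmallest(k, (max_dist_box_to_point(query_box, s) for s in data))[-1]
    xmin, ymin, xmax, ymax = query_box
    return (xmin - r, ymin - r, xmax + r, ymax + r)

def in_box(box, p):
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]

def knn_join_partition(queries, data, k):
    """knn join for one partition of query points (given as tuples)."""
    ebox = knn_expanded_box(bounding_box(queries), data, k)
    # In a MapReduce setting this filter would decide which data points are
    # shipped to the partition; here it is a plain scan instead of an R-tree.
    candidates = [s for s in data if in_box(ebox, s)]
    return {
        q: heapq.nsmallest(k, candidates, key=lambda s: math.dist(q, s))
        for q in queries
    }
```

Because the bound r holds for every location in the partition's box, the filtered candidate set is guaranteed to contain the exact k nearest neighbors of each query point, so pruning does not change the join result.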

Cite

Liu, Y., Jing, N., Chen, L., & Xiong, W. (2013). Algorithm for processing k-nearest join based on R-tree in MapReduce. Ruan Jian Xue Bao/Journal of Software, 24(8), 1836–1851. https://doi.org/10.3724/SP.J.1001.2013.04377
