SliceNBound: Solving Closest Pairs and Distance Join Queries in Apache Spark

Abstract

The (K) Closest-Pair(s) Query (KCPQ) finds the (K) closest pair(s) of objects between two spatial datasets. Several recent systems extend Apache Spark with spatial awareness and provide a variety of spatial queries, but none of them supports the KCPQ. Because queries differ in nature and no single processing technique fits all cases, specific queries need specialized algorithms that exploit the power of parallel systems such as Apache Spark. This paper addresses the problem of answering the KCPQ in Apache Spark by presenting such a specialized, fast algorithm that can easily be integrated into any Spark-based system, whether spatially oriented or general-purpose. It also presents a variant of this algorithm that solves the Distance Join Query. Experiments and comparisons with other solutions indicate that our method is fast and efficient.
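To make the two query definitions concrete, the following is a minimal brute-force sketch in plain Python. It is not the SliceNBound algorithm and does not use Spark; it only illustrates what a correct answer to each query looks like on a tiny input. The function names, the sample points, the use of Euclidean distance, and the reading of the Distance Join Query as "all pairs within a threshold distance `eps`" are assumptions made for illustration, since the abstract does not spell them out.

```python
import heapq
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def kcpq_brute_force(P, Q, k):
    """Return the k closest (distance, p, q) pairs between P and Q, nearest first.

    Hypothetical baseline: examines all |P| * |Q| pairs.
    """
    pairs = ((dist(p, q), p, q) for p, q in product(P, Q))
    return heapq.nsmallest(k, pairs)


def distance_join_brute_force(P, Q, eps):
    """Return every (distance, p, q) pair with distance <= eps, nearest first.

    Assumes the Distance Join Query means "all pairs within eps".
    """
    pairs = ((dist(p, q), p, q) for p, q in product(P, Q))
    return sorted(t for t in pairs if t[0] <= eps)


# Illustrative sample data (not taken from the paper).
P = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
Q = [(1.0, 0.0), (5.0, 7.0), (8.0, 1.0)]

closest_two = kcpq_brute_force(P, Q, 2)
within_two = distance_join_brute_force(P, Q, 2.0)
```

A baseline like this costs time proportional to the product of the dataset sizes, which is why a specialized parallel algorithm is attractive for large inputs; it can still serve as a correctness check for a faster implementation on small samples.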

Cite

Mavrommatis, G., Moutafis, P., Vassilakopoulos, M., García-García, F., & Corral, A. (2017). SliceNBound: Solving closest pairs and distance join queries in Apache Spark. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10509 LNCS, pp. 199–213). Springer Verlag. https://doi.org/10.1007/978-3-319-66917-5_14
