Quicker similarity joins in metric spaces

Kimmo Fredriksson; Billy Braithwaite

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8199 LNCS (2013), pp. 127–140

DOI: 10.1007/978-3-642-41062-8_13

Abstract

We consider the join operation in metric spaces. Given two sets A and B of objects drawn from some universe 𝕌, we want to compute the set A ⋈ B = {(a, b) ∈ A × B | d(a, b) ≤ r} efficiently, where d : 𝕌 × 𝕌 → ℝ+ is a metric distance function and r ∈ ℝ+ is a user-supplied query radius. In particular, we are interested in the case where no index is available for either A or B (nor can we afford to build one). In this paper we improve the Quickjoin algorithm (Jacox and Samet, 2008), which is based on the well-known Quicksort algorithm, by (i) replacing the low-level component, which handles small subsets with an essentially brute-force nested loop, with a more efficient method; (ii) showing that, contrary to Quicksort, in Quickjoin unbalanced partitioning can improve the algorithm; and (iii) making the algorithm probabilistic while still obtaining most of the relevant results. We also show how to use Quickjoin to compute k-nearest neighbor joins. The experimental results show that the method works well in practice. © 2013 Springer-Verlag.
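
To make the join definition in the abstract concrete, the following is a minimal sketch of the brute-force nested-loop join, that is, the baseline that evaluates d(a, b) for every pair in A × B. It is not the Quickjoin algorithm or the paper's improved method. The function name `similarity_join`, the sample points, and the choice of the Euclidean metric are assumptions made for illustration only.

```python
from itertools import product
from math import dist  # Euclidean distance, available in Python 3.8+


def similarity_join(A, B, d, r):
    """Return every pair (a, b) in A x B with d(a, b) <= r.

    Hypothetical helper: a plain nested loop that performs
    len(A) * len(B) distance evaluations and uses no index.
    """
    return [(a, b) for a, b in product(A, B) if d(a, b) <= r]


# Illustrative data (assumed, not from the paper): points in the plane.
A = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
B = [(0.0, 0.5), (4.0, 5.0), (9.0, 9.0)]

pairs = similarity_join(A, B, dist, r=1.0)
for a, b in pairs:
    print(a, b, dist(a, b))
```

Any function that satisfies the metric axioms can be passed as `d`; the quadratic number of distance evaluations in this baseline is the cost that partitioning approaches such as Quickjoin aim to reduce.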

Cite

CITATION STYLE

APA

Fredriksson, K., & Braithwaite, B. (2013). Quicker similarity joins in metric spaces. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8199 LNCS, pp. 127–140). https://doi.org/10.1007/978-3-642-41062-8_13
