Quicker similarity joins in metric spaces

Abstract

We consider the join operation in metric spaces. Given two sets A and B of objects drawn from some universe 𝕌, we want to compute the set A ⋈ B = {(a, b) ∈ A × B | d(a, b) ≤ r} efficiently, where d : 𝕌 × 𝕌 → ℝ+ is a metric distance function and r ∈ ℝ+ is a user-supplied query radius. In particular, we are interested in the case where no index is available (nor can we afford to build one) for either A or B. In this paper we improve the Quickjoin algorithm (Jacox and Samet, 2008), based on the well-known Quicksort algorithm, by (i) replacing the low-level component, which handles small subsets with an essentially brute-force nested loop, with a more efficient method; (ii) showing that, contrary to Quicksort, in Quickjoin unbalanced partitioning can improve the algorithm; and (iii) making the algorithm probabilistic while still obtaining most of the relevant results. We also show how to use Quickjoin to compute k-nearest neighbor joins. The experimental results show that the method works well in practice. © 2013 Springer-Verlag.
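To make the join definition concrete, here is a minimal Python sketch of the brute-force nested-loop baseline, i.e. the set {(a, b) ∈ A × B | d(a, b) ≤ r} computed by comparing every pair. It is not the paper's improved method or Quickjoin itself; the function names, the Euclidean metric, and the sample points are assumptions chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def nested_loop_join(
    A: Sequence[T],
    B: Sequence[T],
    d: Callable[[T, T], float],
    r: float,
) -> List[Tuple[T, T]]:
    """Return every pair (a, b) in A x B with d(a, b) <= r.

    Illustrative baseline only: it evaluates d for all |A| * |B| pairs.
    """
    return [(a, b) for a in A for b in B if d(a, b) <= r]


def euclidean(p: Tuple[float, ...], q: Tuple[float, ...]) -> float:
    """Euclidean distance, used here as one example of a metric d."""
    return math.dist(p, q)


# Hypothetical sample data: 2-D points.
A = [(0.0, 0.0), (5.0, 5.0)]
B = [(0.0, 1.0), (4.0, 5.0), (9.0, 9.0)]

pairs = nested_loop_join(A, B, euclidean, r=1.0)
print(pairs)
```

Any function satisfying the metric axioms could be passed as `d`; the quadratic number of distance evaluations in this baseline is the cost that index-free algorithms such as Quickjoin aim to reduce.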

Cite

Fredriksson, K., & Braithwaite, B. (2013). Quicker similarity joins in metric spaces. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8199 LNCS, pp. 127–140). https://doi.org/10.1007/978-3-642-41062-8_13
