Similarity joins are troublesome database operators that often produce results much larger than the user really needs or expects. To return the similar elements, similarity joins also require sorting during retrieval, although order is a concept not supported in the relational model. This paper addresses both issues by extending the similarity join concept to a broader set of binary operators that retrieve the most similar pairs and embed the sorting operation only as an internal processing step, so as to comply with relational theory. Additionally, our extension allows exploring another useful condition not previously considered in similarity retrieval: the negation of predicates. Experiments performed on real and synthetic data show that our operators are fast enough to be used in real applications and scale well for both multidimensional and non-dimensional metric data.
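The ideas named in the abstract can be illustrated with a minimal sketch. It is not the operators or algorithms proposed in the paper: the function names (`range_join`, `k_closest_join`, `negated_range_join`), the brute-force nested-loop evaluation, and the Euclidean distance are all assumptions made for illustration, and the negated join is only one possible reading of "negation of predicates". The sketch shows a range join whose result can grow large, a join that keeps only the k most similar pairs while using ordering purely as an internal step, and a join on a negated range predicate.

```python
import heapq
import math
from itertools import product


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.dist(a, b)


def range_join(left, right, radius, dist=euclidean):
    """All pairs within `radius` of each other; may return many pairs."""
    return {(a, b) for a, b in product(left, right) if dist(a, b) <= radius}


def k_closest_join(left, right, k, dist=euclidean):
    """The k most similar pairs (hypothetical operator name).

    Ordering by distance is used only internally; the result is an
    unordered set, as the relational model expects. Ties at the k-th
    position are broken arbitrarily in this sketch.
    """
    best = heapq.nsmallest(k, product(left, right), key=lambda p: dist(*p))
    return set(best)


def negated_range_join(left, right, radius, dist=euclidean):
    """Pairs that do NOT satisfy the range predicate (assumed reading)."""
    return {(a, b) for a, b in product(left, right) if dist(a, b) > radius}


if __name__ == "__main__":
    left = [(0, 0), (5, 5)]
    right = [(0, 1), (6, 5), (10, 10)]
    print(range_join(left, right, radius=2))
    print(k_closest_join(left, right, k=2))
    print(negated_range_join(left, right, radius=2))
```

The brute-force pairing costs one distance computation per pair of elements, so it only serves to make the semantics concrete; any distance function over the elements could be passed as `dist` in place of the Euclidean one.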
CITATION STYLE
Carvalho, L. O., Santos, L. F. D., Oliveira, W. D., Traina, A. J. M., & Traina, C. (2015). Similarity joins and beyond: An extended set of binary operators with order. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9371, pp. 29–41). Springer Verlag. https://doi.org/10.1007/978-3-319-25087-8_3