Diversity in similarity joins

Lucio F.D. Santos; Luiz Olmes Carvalho; Willian D. Oliveira; Agma J.M. Traina; Caetano Traina

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9371 42-53

DOI: 10.1007/978-3-319-25087-8_4

Abstract

With the increasing ability of current applications to produce and consume complex data, such as images and geographic information, the similarity join has attracted considerable attention. However, this operator does not consider the relationships among the elements in the answer, so it generates results containing many pairs that are similar to one another, which adds no value to the final answer. Result diversification methods aim to retrieve elements that are similar enough to satisfy the similarity conditions while also accounting for the diversity among the elements in the answer, producing a more heterogeneous result with smaller cardinality, which improves the meaning of the answer. Still, diversity has been studied only when applied to unary operations. In this paper, we introduce the concept of diverse similarity joins: a similarity join operator that ensures smaller, more diversified, and more useful answers. Experiments performed on real and synthetic datasets show that our proposal allows exploiting diversity in similarity joins without diminishing their performance, while providing elements that cover the same data space distribution as the non-diverse answers.
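To make the idea concrete, the following is a minimal sketch and not the operator defined in the paper. It assumes a range-based similarity join over points under Euclidean distance, followed by a simple greedy filter that discards a pair when it lies too close to a pair already kept. The names `range_join`, `diverse_range_join`, `radius`, and `min_sep`, as well as the pair-to-pair distance used here, are hypothetical choices made only for illustration.

```python
from math import dist


def range_join(left, right, radius):
    """Plain similarity (range) join: every pair (a, b) with dist(a, b) <= radius."""
    return [(a, b) for a in left for b in right if dist(a, b) <= radius]


def diverse_range_join(left, right, radius, min_sep):
    """Greedy diversification of a range join (illustrative assumption only).

    A candidate pair is kept only if it is at least `min_sep` away from
    every pair kept so far, so near-duplicate pairs are dropped.
    """
    def pair_dist(p, q):
        # Hypothetical pair-to-pair distance: the larger of the two
        # component-wise distances.
        return max(dist(p[0], q[0]), dist(p[1], q[1]))

    # Visit the most similar pairs first so they are preferred.
    candidates = sorted(range_join(left, right, radius),
                        key=lambda p: dist(p[0], p[1]))
    kept = []
    for pair in candidates:
        if all(pair_dist(pair, k) >= min_sep for k in kept):
            kept.append(pair)
    return kept


# Two tight clusters: one near the origin, one near (5, 5).
left = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
right = [(0.0, 0.2), (0.1, 0.2), (5.0, 5.3)]

plain = range_join(left, right, radius=0.5)
diverse = diverse_range_join(left, right, radius=0.5, min_sep=1.0)
```

On this toy input the plain join returns five pairs, four of which come from the cluster near the origin, while the filtered join keeps one pair per cluster. The smaller answer still covers both regions of the data space, which is the behavior the abstract describes.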

Cite (APA)

Santos, L. F. D., Carvalho, L. O., Oliveira, W. D., Traina, A. J. M., & Traina, C. (2015). Diversity in similarity joins. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9371, pp. 42–53). Springer Verlag. https://doi.org/10.1007/978-3-319-25087-8_4
