Semantic similarity group by operators for metric data

Natan A. Laverde; Mirela T. Cazzolato; Agma J.M. Traina; Caetano Traina

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10609 LNCS 247-261

DOI: 10.1007/978-3-319-68474-1_17

Abstract

Grouping operators summarize data in a DBMS by arranging elements into groups using identity comparisons. For metric data, however, grouping by identity is seldom useful, and similarity is often a better fit. Operators that group data elements by similarity exist, but they do not achieve good results for certain data domains or distributions. The major contributions of this work are a novel operator, SGB-Vote, that assigns groups using an election involving already assigned groups, and an extension for current operators that bounds each group to a maximum number of nearest neighbors. The operators were implemented in a framework and evaluated using real and synthetic datasets from diverse domains, considering both the quality of the groups and execution time. The results show that the proposed operators produce higher-quality groups in all tested datasets and that the operators can run efficiently inside a DBMS.
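The abstract does not spell out how the election works, so the following is only a rough illustration of what vote-based similarity grouping might look like, not the SGB-Vote operator itself. It assumes Euclidean points, a hypothetical similarity threshold `eps`, and a simple rule in which each already grouped element within `eps` casts one vote for its own group; the function name `vote_group` and the tie-breaking rule are likewise assumptions made for this sketch.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def vote_group(points, eps):
    """Assign a group id to each point, in input order.

    Illustrative assumption: every previously grouped point within `eps`
    casts one vote for its own group, and the new point joins the winner.
    A point with no such neighbor starts a new group.
    """
    labels = []
    next_group = 0
    for i, p in enumerate(points):
        # Collect votes from already assigned points within the threshold.
        votes = Counter(
            labels[j] for j in range(i) if dist(p, points[j]) <= eps
        )
        if votes:
            # Most votes wins; ties go to the smallest group id (assumption).
            winner = min(votes, key=lambda g: (-votes[g], g))
        else:
            winner = next_group
            next_group += 1
        labels.append(winner)
    return labels


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.05, 0.1)]
labels = vote_group(points, eps=0.5)
print(labels)
```

In this toy input the last point lies close to the first two, so both vote for their shared group and the point joins it. The result of such a scheme depends on input order, and the sketch does not model the paper's nearest-neighbor bound on group size.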

Citation (APA)

Laverde, N. A., Cazzolato, M. T., Traina, A. J. M., & Traina, C. (2017). Semantic similarity group by operators for metric data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10609 LNCS, pp. 247–261). Springer Verlag. https://doi.org/10.1007/978-3-319-68474-1_17
