Semantic similarity group by operators for metric data

2Citations
Citations of this article
4Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Grouping operators summarize data in DBMS arranging elements in groups using identity comparisons. However, for metric data, grouping by identity is seldom useful, since adopting the concept of similarity is often a better fit. There are operators that can group data elements using similarity. However, the existing operators do not achieve good results for certain data domains or distributions. The major contributions of this work are a novel operator called the SGB-Vote that assign groups using an election involving already assigned groups and an extension for current operators bounds each group to a maximum amount of the nearest neighbors. The operators were implemented in a framework and evaluated using real and synthetic datasets from diverse domains considering both quality of and execution time. The results obtained show that the proposed operators produce higher quality groups in all tested datasets and highlight that the operators can efficiently run inside a DBMS.

Cite

CITATION STYLE

APA

Laverde, N. A., Cazzolato, M. T., Traina, A. J. M., & Traina, C. (2017). Semantic similarity group by operators for metric data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10609 LNCS, pp. 247–261). Springer Verlag. https://doi.org/10.1007/978-3-319-68474-1_17

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free