SQLEM: Fast Clustering in SQL using the EM Algorithm

Carlos Ordonez; Paul Cereghini

Conference Proceedings (Open Access)

SIGMOD 2000 - Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (2000) 559-570

DOI: 10.1145/342009.335468

Abstract

Clustering is one of the most important tasks in data mining applications. This paper presents an efficient SQL implementation of the EM algorithm for clustering very large databases. Our version can effectively handle high-dimensional data, a large number of clusters and, more importantly, a very large number of data records. We present three strategies for implementing EM in SQL: horizontal, vertical, and hybrid. We expect this work to be useful for data mining programmers and users who want to cluster large data sets inside a relational DBMS.
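To make the idea of running EM inside a relational engine concrete, the following is a minimal sketch of a single E-step written as one SQL query and driven from Python's built-in `sqlite3` module. It is not the paper's implementation. The "vertical" layout used here (one row per point per dimension), the table names `y`, `c`, `r`, `w`, their columns, the toy data, and the choice of one shared variance per dimension are all assumptions made for illustration.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Register exp() ourselves, since not every SQLite build ships math functions.
conn.create_function("exp", 1, math.exp)

# Hypothetical schema (an assumption, not taken from the paper).
conn.executescript("""
CREATE TABLE y (rid INTEGER, dim INTEGER, val REAL);  -- points, one row per dimension
CREATE TABLE c (k INTEGER, dim INTEGER, mean REAL);   -- cluster means
CREATE TABLE r (dim INTEGER, var REAL);               -- shared per-dimension variance
CREATE TABLE w (k INTEGER, weight REAL);              -- cluster weights
""")

# Toy data: two points near (0, 0) and two near (5, 5).
points = [(0.0, 0.1), (0.2, -0.1), (5.0, 5.1), (4.9, 5.2)]
conn.executemany(
    "INSERT INTO y VALUES (?, ?, ?)",
    [(rid, dim, v) for rid, p in enumerate(points) for dim, v in enumerate(p)],
)
conn.executemany(
    "INSERT INTO c VALUES (?, ?, ?)",
    [(0, 0, 0.0), (0, 1, 0.0), (1, 0, 5.0), (1, 1, 5.0)],
)
conn.executemany("INSERT INTO r VALUES (?, ?)", [(0, 1.0), (1, 1.0)])
conn.executemany("INSERT INTO w VALUES (?, ?)", [(0, 0.5), (1, 0.5)])

# E-step: squared scaled distance per (point, cluster), then the weighted
# Gaussian kernel, then normalization so each point's values sum to 1.
# The Gaussian normalizing constant is omitted because, with a variance
# shared by all clusters, it cancels in the ratio.
E_STEP = """
WITH dist AS (
  SELECT y.rid AS rid, c.k AS k,
         SUM((y.val - c.mean) * (y.val - c.mean) / r.var) AS d2
  FROM y
  JOIN c ON c.dim = y.dim
  JOIN r ON r.dim = y.dim
  GROUP BY y.rid, c.k
),
lik AS (
  SELECT dist.rid AS rid, dist.k AS k,
         w.weight * exp(-0.5 * dist.d2) AS p
  FROM dist JOIN w ON w.k = dist.k
),
tot AS (
  SELECT rid, SUM(p) AS s FROM lik GROUP BY rid
)
SELECT lik.rid, lik.k, lik.p / tot.s AS resp
FROM lik JOIN tot ON tot.rid = lik.rid
ORDER BY lik.rid, lik.k
"""
responsibilities = conn.execute(E_STEP).fetchall()

for rid, k, resp in responsibilities:
    print(f"point {rid} -> cluster {k}: {resp:.6f}")
```

In this layout the number of dimensions and clusters affects only the row counts, not the query text. A full EM loop would add an M-step that re-estimates the means, variances, and weights from these responsibilities, and a production version would also need to guard against `exp()` underflow for points far from every cluster.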

Citation (APA)

Ordonez, C., & Cereghini, P. (2000). SQLEM: Fast Clustering in SQL using the EM Algorithm. In SIGMOD 2000 - Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data (pp. 559–570). Association for Computing Machinery, Inc. https://doi.org/10.1145/342009.335468
