A Comparative Study of the Use of Coresets for Clustering Large Datasets

Nguyen Le Hoang; Tran Khanh Dang; Le Hong Trang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019), vol. 11814 LNCS, pp. 45-55

DOI: 10.1007/978-3-030-35653-8_4

Abstract

A coreset can be described as a compact subset of a data set such that models trained on the coreset fit nearly as well as models trained on the full data set. By using coresets, we can scale a large data set down to a small one in order to reduce the computational cost of a machine learning problem. In recent years, data scientists have investigated various methods for constructing coresets. Two state-of-the-art algorithms proposed in 2018 are ProTraS by Ros & Guillaume and Lightweight Coreset by Bachem et al. In this paper, we briefly introduce these two algorithms and compare them to identify the benefits and drawbacks of each.
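The abstract does not spell out either algorithm, so the following is only an illustrative sketch of the sampling idea commonly associated with lightweight coresets: draw points from a mixture of a uniform distribution and a distribution proportional to the squared distance from the data mean, then attach importance weights. The function name `lightweight_coreset`, its parameters, and the toy grid data are assumptions made for this example and are not taken from the paper; ProTraS is not shown.

```python
import random


def lightweight_coreset(points, m, seed=0):
    """Sample m weighted points from `points` (a list of equal-length tuples).

    Hypothetical helper for illustration only; not the paper's code.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])

    # Mean of the full data set.
    mean = [sum(p[j] for p in points) / n for j in range(dim)]

    # Squared Euclidean distance of every point to the mean.
    sq_dist = [sum((p[j] - mean[j]) ** 2 for j in range(dim)) for p in points]
    total = sum(sq_dist)

    # Sampling distribution: half uniform, half proportional to squared distance.
    if total > 0:
        q = [0.5 / n + 0.5 * d / total for d in sq_dist]
    else:
        q = [1.0 / n] * n  # all points identical: fall back to uniform

    # Draw m indices with replacement according to q.
    idx = rng.choices(range(n), weights=q, k=m)

    # Importance weight 1 / (m * q) keeps weighted sums unbiased in expectation.
    return [(points[i], 1.0 / (m * q[i])) for i in idx]


# Toy usage: 1000 points on a 10-by-100 grid, reduced to 50 weighted points.
data = [(float(i % 10), float(i // 10)) for i in range(1000)]
coreset = lightweight_coreset(data, m=50)
```

The resulting `(point, weight)` pairs could then be passed to any clustering routine that accepts per-sample weights, which is where the reduction in computational cost would come from.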

Citation (APA)

Hoang, N. L., Dang, T. K., & Trang, L. H. (2019). A Comparative Study of the Use of Coresets for Clustering Large Datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11814 LNCS, pp. 45–55). Springer. https://doi.org/10.1007/978-3-030-35653-8_4
