Sampling-based data mining algorithms: Modern techniques and case studies

Matteo Riondato

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014), vol. 8726 LNAI (Part 3), pp. 516-519

DOI: 10.1007/978-3-662-44845-8_48

Abstract

Sampling a dataset for faster analysis and treating it as a sample from an unknown distribution are two sides of the same coin. We discuss the use of modern techniques involving the Vapnik-Chervonenkis (VC) dimension to study the trade-off between sample size and the accuracy of data mining results obtained from a sample. We report two case studies in which we and our collaborators employed these techniques to develop efficient sampling-based algorithms for two problems: computing betweenness centrality in large graphs and extracting statistically significant frequent itemsets from transactional datasets. © 2014 Springer-Verlag.
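The trade-off between sample size and accuracy mentioned in the abstract can be illustrated with the standard VC-dimension bound for ε-approximations: a uniform random sample of size at least (c/ε²)(d + ln(1/δ)) is, with probability at least 1 − δ, accurate to within ε for every set in a range space of VC dimension d. The sketch below is only an illustration of that general bound, not the algorithm from the paper. The function name `vc_sample_size` is hypothetical, and the default constant `c = 0.5` is an assumption, since the abstract does not state a value.

```python
import math


def vc_sample_size(vc_dim: int, epsilon: float, delta: float, c: float = 0.5) -> int:
    """Sample size sufficient for an epsilon-approximation with prob. >= 1 - delta.

    Implements m = ceil((c / epsilon^2) * (vc_dim + ln(1 / delta))).
    The constant c is an assumption (not given in the abstract).
    """
    if vc_dim < 0:
        raise ValueError("vc_dim must be non-negative")
    if not (0.0 < epsilon < 1.0):
        raise ValueError("epsilon must be in (0, 1)")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must be in (0, 1)")
    return math.ceil((c / epsilon ** 2) * (vc_dim + math.log(1.0 / delta)))


if __name__ == "__main__":
    # Halving epsilon roughly quadruples the required sample size,
    # while the dependence on delta is only logarithmic.
    for eps in (0.1, 0.05, 0.025):
        print(eps, vc_sample_size(vc_dim=10, epsilon=eps, delta=0.1))
```

Note that the bound depends on the VC dimension of the problem rather than on the size of the dataset, which is what makes sampling attractive for very large inputs.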

Citation (APA)

Riondato, M. (2014). Sampling-based data mining algorithms: Modern techniques and case studies. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8726 LNAI, pp. 516–519). Springer Verlag. https://doi.org/10.1007/978-3-662-44845-8_48
