Sampling-based data mining algorithms: Modern techniques and case studies

1Citations
Citations of this article
4Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Sampling a dataset for faster analysis and looking at it as a sample from an unknown distribution are two faces of the same coin. We discuss the use of modern techniques involving the Vapnik-Chervonenkis (VC) dimension to study the trade-off between sample size and accuracy of data mining results that can be obtained from a sample. We report two case studies where we and collaborators employed these techniques to develop efficient sampling-based algorithms for the problems of betweenness centrality computation in large graphs and extracting statistically significant Frequent Itemsets from transactional datasets. © 2014 Springer-Verlag.

Cite

CITATION STYLE

APA

Riondato, M. (2014). Sampling-based data mining algorithms: Modern techniques and case studies. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8726 LNAI, pp. 516–519). Springer Verlag. https://doi.org/10.1007/978-3-662-44845-8_48

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free