Effective sampling for mining association rules

Yanrong Li; Raj P. Gopalan

Conference Proceedings

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (2004) 3339 391-401

DOI: 10.1007/978-3-540-30549-1_35

22 Citations

14 Readers

Abstract

Discovering association rules in a very large database is time consuming, and researchers have developed many algorithms to improve efficiency. Sampling can significantly reduce the cost of mining, because the mining algorithm only has to process a dataset that is small compared to the original database. In particular, when data arrives as a stream faster than it can be processed, sampling may be the only option. The key issues for a particular data mining task are how to sample the data and how large the sample must be for a given error bound and confidence level. In this paper, we derive a sufficient sample size, based on the central limit theorem, for sampling large datasets with replacement. This approach requires a smaller sample size than one based on the Chernoff bounds and is effective for association rule mining. The effectiveness of the method has been evaluated on both dense and sparse datasets. © Springer-Verlag Berlin Heidelberg 2004.
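The abstract does not state the formulas themselves, so the following is only a rough sketch of the comparison it describes, using standard textbook forms as assumptions: a central-limit-theorem bound n ≥ z²/(4ε²), taking the worst-case itemset support p = 0.5, and a Chernoff-style (Hoeffding) bound n ≥ ln(2/δ)/(2ε²). The function names, the parameters `epsilon` (error bound) and `delta` (1 minus the confidence level), and the toy transactions are all illustrative and not taken from the paper.

```python
import math
import random
from statistics import NormalDist


def clt_sample_size(epsilon, delta):
    """Sample size from a normal approximation (assumed textbook form).

    Uses the worst-case variance p(1 - p) <= 1/4, so the result does not
    depend on the true support of any itemset.
    """
    z = NormalDist().inv_cdf(1 - delta / 2)  # two-sided critical value
    return math.ceil(z * z / (4 * epsilon * epsilon))


def chernoff_sample_size(epsilon, delta):
    """Sample size from a Chernoff-style additive bound (assumed form)."""
    return math.ceil(math.log(2 / delta) / (2 * epsilon * epsilon))


def sample_with_replacement(transactions, n, seed=None):
    """Draw n transactions uniformly at random, with replacement."""
    rng = random.Random(seed)
    return rng.choices(transactions, k=n)


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


if __name__ == "__main__":
    eps, delta = 0.01, 0.05  # error bound 1%, confidence level 95%
    print("CLT-based size:     ", clt_sample_size(eps, delta))
    print("Chernoff-based size:", chernoff_sample_size(eps, delta))

    # Toy database (hypothetical data): estimate one itemset's support
    # from a sample drawn with replacement.
    db = [("a", "b"), ("a",), ("b", "c"), ("a", "b", "c")] * 5000
    sample = sample_with_replacement(db, clt_sample_size(eps, delta), seed=1)
    print("true support of {a, b}:   ", support(db, ("a", "b")))
    print("sampled support of {a, b}:", support(sample, ("a", "b")))
```

Under these assumed forms, an error bound of 0.01 at 95% confidence gives a CLT-based size roughly half the Chernoff-based one, which is consistent with the abstract's claim that the central-limit approach needs a smaller sample.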

Citation (APA)

Li, Y., & Gopalan, R. P. (2004). Effective sampling for mining association rules. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3339, pp. 391–401). Springer Verlag. https://doi.org/10.1007/978-3-540-30549-1_35
