Fast vertical mining using diffsets

Abstract

A number of vertical mining algorithms have recently been proposed for association mining; they have been shown to be very effective and usually outperform horizontal approaches. The main advantage of the vertical format is its support for fast frequency counting via intersection operations on transaction ids (tids) and automatic pruning of irrelevant data. The main problem with these approaches arises when intermediate results of vertical tid lists become too large for memory, which limits the scalability of the algorithm. In this paper we present a novel vertical data representation called Diffset, which only keeps track of differences in the tids of a candidate pattern from its generating frequent patterns. We show that diffsets drastically cut down the size of memory required to store intermediate results. We show how diffsets, when incorporated into previous vertical mining methods, increase the performance significantly. Copyright 2003 ACM.
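
To make the contrast between tid lists and diffsets concrete, the following is a minimal sketch rather than the authors' implementation. The toy database, the item names, and the helper names (`support_via_tidsets`, `diffset_of_pair`, `support_via_diffsets`) are all assumptions made for illustration. It counts the support of a 2-itemset once by intersecting tid lists and once by storing only the differences, using the relations d(XY) = t(X) - t(Y) = d(Y) - d(X) and support(XY) = support(X) - |d(XY)|.

```python
# Toy vertical database (hypothetical data): item -> set of transaction ids.
tidsets = {
    "A": {1, 3, 4, 5},
    "B": {1, 2, 3, 4, 5, 6},
    "C": {2, 4, 5, 6},
}

# Assumption for this sketch: every transaction contains at least one item,
# so the union of all tidsets is the full set of transaction ids.
all_tids = set().union(*tidsets.values())


def support_via_tidsets(x, y):
    """Classic vertical counting: support is the size of the tid intersection."""
    return len(tidsets[x] & tidsets[y])


# Diffset of a single item relative to the empty prefix:
# the tids of transactions that do NOT contain the item.
diffsets = {item: all_tids - tids for item, tids in tidsets.items()}


def diffset_of_pair(x, y):
    """d(xy) = t(x) - t(y), computed from diffsets alone as d(y) - d(x)."""
    return diffsets[y] - diffsets[x]


def support_via_diffsets(x, y):
    """support(xy) = support(x) - |d(xy)|; no tid intersection is needed."""
    support_x = len(all_tids) - len(diffsets[x])
    return support_x - len(diffset_of_pair(x, y))


if __name__ == "__main__":
    for x, y in [("A", "B"), ("A", "C"), ("B", "C")]:
        print(x + y, support_via_tidsets(x, y), support_via_diffsets(x, y))
```

In this toy database the stored sets are small either way, but on dense data the diffset of a candidate is typically much smaller than its tid list, which is the memory saving the abstract describes.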

Cite

Zaki, M. J., & Gouda, K. (2003). Fast vertical mining using diffsets. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 326–335). https://doi.org/10.1145/956750.956788