Frequent itemsets capture important information about a database, and mining them efficiently is a core problem in data mining. The divide-and-conquer strategy is well suited to this problem. Most algorithms that adopt it construct a very large number of conditional databases while mining, so the representation of conditional databases and the method used to construct them greatly influence performance. In this study, we propose a node-set structure for representing a conditional database and develop a novel node-set-based algorithm, NS, for mining frequent itemsets. During mining, all node-sets are derived from a prefix-tree that stores the complete frequent itemset information of the mined database. Compared with previous conditional database representations, node-sets are compact and contiguous, which allows NS to process them quickly. Constructing a conditional database involves counting items. In NS, the counting and construction procedures are blended, which saves the time otherwise spent scanning conditional databases; moreover, the major operations in constructing conditional databases are very simple comparisons. Experimental data show that NS outperforms several well-known algorithms, including FPgrowth* and LCM, which are among the fastest algorithms, on various databases. © 2012 Springer-Verlag.
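To make the divide-and-conquer idea concrete, the following is a minimal Python sketch of generic pattern-growth mining over conditional databases. It is not the NS algorithm and does not use node-sets or a prefix-tree: conditional databases are represented here as plain lists of transactions, and the names `mine` and `frequent_itemsets` are hypothetical, introduced only for illustration. Unlike NS, this sketch counts items in a separate pass over each conditional database, which is the kind of scan the abstract says NS avoids.

```python
from collections import Counter


def mine(transactions, min_support, prefix=()):
    """Yield (itemset, support) pairs by recursive projection.

    Illustrative only: a naive list-based representation of
    conditional databases, not the node-set structure of NS.
    """
    # Count how many transactions of the current database contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = sorted(i for i, c in counts.items() if c >= min_support)

    for idx, item in enumerate(frequent):
        itemset = prefix + (item,)
        yield itemset, counts[item]

        # Conditional database of `item`: the transactions that contain it,
        # restricted to frequent items that come later in the fixed order.
        later = set(frequent[idx + 1:])
        conditional = [
            [i for i in t if i in later] for t in transactions if item in t
        ]
        conditional = [t for t in conditional if t]  # drop empty transactions

        # Divide and conquer: mine the smaller database with a longer prefix.
        yield from mine(conditional, min_support, itemset)


def frequent_itemsets(transactions, min_support):
    """Collect all frequent itemsets into a dict mapping itemset -> support."""
    return dict(mine(transactions, min_support))


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(frequent_itemsets(db, 3).items()):
        print(itemset, support)
```

Each recursive call builds one conditional database, so the number of such databases grows with the number of frequent itemsets. This is why, as the abstract argues, their representation and construction cost dominate the running time of algorithms in this family.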
CITATION STYLE
Qu, J. F., & Liu, M. (2012). Mining frequent itemsets using node-sets of a prefix-tree. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7446 LNCS, pp. 453–467). https://doi.org/10.1007/978-3-642-32600-4_34