Improved methods for extracting frequent itemsets from interim-support trees

Frans Coenen; Paul Leng; Aris Pagourtzis; Wojciech Rytter; Dora Souliou

Conference Proceedings

Research and Development in Intelligent Systems XXII - Proceedings of AI 2005, the 25th SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence (2006) 263-276

DOI: 10.1007/978-1-84628-226-3_20

Abstract

Mining association rules in relational databases is a significant computational task with many applications. A fundamental ingredient of this task is the discovery of sets of attributes (itemsets) whose frequency in the data exceeds some threshold value. In previous work [9] we introduced an approach to this problem which begins by carrying out an efficient partial computation of the necessary totals, storing these interim results in a set-enumeration tree. That work demonstrated that making use of this structure can significantly reduce the cost of determining the frequent sets. In this paper we describe two algorithms for completing the calculation of frequent sets using an interim-support tree. These algorithms are improved versions of earlier algorithms described in the above-mentioned work and in a subsequent paper [7]. The first of our new algorithms (TTF) differs from its ancestor in that it uses a novel tree pruning technique, based on the notion of (fixed-prefix) potential inclusion, which is specially designed for trees that are implemented using only two pointers per node. This allows the interim-support tree to be implemented in a space-efficient manner. The second algorithm (PTF) explores the idea of storing the frequent itemsets in a second tree structure, called the total support tree (T-tree); the improvement lies in the use of multiple pointers per node, which provides rapid access to the nodes of the T-tree and makes it possible to design a new, usually faster, method for updating them. Experimental comparison shows that these improvements result in considerable speedup for both algorithms. Further comparison between the two improved algorithms shows that PTF is generally faster on instances with a large number of frequent itemsets, while TTF is more appropriate whenever this number is small; in addition, TTF behaves quite well on instances in which the densities of the items of the database have a high variance. © 2006 Springer-Verlag London.
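
To make the notion of a frequent itemset concrete, the following is a minimal brute-force sketch in Python. It is not TTF, PTF, or the interim-support tree described in the abstract; it only illustrates the underlying problem of counting how many records contain each itemset and keeping those that reach a threshold. The function name `frequent_itemsets`, the `min_support` parameter (an absolute count), the "at least" comparison, and the sample transactions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets found in at least
    `min_support` transactions (hypothetical helper, brute force)."""
    counts = Counter()
    for row in transactions:
        items = sorted(set(row))  # canonical order so each itemset has one key
        # Enumerate every non-empty subset of this transaction.
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    # Keep only itemsets whose support reaches the threshold.
    return {s: c for s, c in counts.items() if c >= min_support}


# Made-up sample data: four records over the items a, b, c.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b"},
]
result = frequent_itemsets(transactions, min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

This enumeration is exponential in the record length, which is the kind of cost that tree-based approaches such as the one summarised above aim to reduce.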

Cite

Coenen, F., Leng, P., Pagourtzis, A., Rytter, W., & Souliou, D. (2006). Improved methods for extracting frequent itemsets from interim-support trees. In Research and Development in Intelligent Systems XXII - Proceedings of AI 2005, the 25th SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence (pp. 263–276). Springer London. https://doi.org/10.1007/978-1-84628-226-3_20
