The efficiency of frequent itemset mining algorithms is determined mainly by three factors: the way candidates are generated, the data structure that is used, and the implementation details. Most papers focus on the first factor, some describe the underlying data structures, but implementation details are almost always neglected. In this paper we show that the effect of implementation can be more important than the selection of the algorithm. Ideas that seem quite promising may turn out to be ineffective once we descend to the implementation level.

We theoretically and experimentally analyze APRIORI, which is the most established algorithm for frequent itemset mining. Several implementations of the algorithm have been put forward in the last decade. Although they are implementations of the very same algorithm, they display large differences in running time and memory requirements. In this paper we describe an implementation of APRIORI that outperforms all implementations known to us. We analyze, theoretically and experimentally, the principal data structure of our solution. This data structure is the main factor in the efficiency of our implementation. Moreover, we present a simple modification of APRIORI that appears to be faster than the original algorithm.
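For readers unfamiliar with APRIORI, the following is a minimal textbook-style sketch of the level-wise algorithm. It is not the implementation described in this paper. The function name `apriori`, the absolute-count `min_support` parameter, and the use of plain Python `frozenset`s and dictionaries in place of a specialised data structure are all assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Illustrative sketch only: `min_support` is an absolute count, and
    candidates are kept in ordinary sets and dicts (an assumption, not
    the data structure analysed in the paper).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the support of every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join pairs of frequent (k-1)-itemsets and
        # keep a k-itemset only if all of its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent
                for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Support counting: naive scan of every transaction per candidate.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result


if __name__ == "__main__":
    # Hypothetical toy database of five transactions.
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in apriori(db, min_support=3).items():
        print(sorted(itemset), support)
```

The counting loop above tests every candidate against every transaction. This is the kind of step where the choice of data structure and low-level implementation would be expected to dominate running time.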