TopPI: An efficient algorithm for item-centric mining

Martin Kirchgessner; Vincent Leroy; Alexandre Termier; Sihem Amer-Yahia; Marie Christine Rousset

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016) 9829 LNCS 19-33

DOI: 10.1007/978-3-319-43946-4_2

Abstract

We introduce TopPI, a new semantics and algorithm designed to mine long-tailed datasets. For each item, regardless of its frequency, TopPI finds the k most frequent closed itemsets that the item belongs to. For example, in our retail dataset, TopPI finds the itemset “nori seaweed, wasabi, sushi rice, soy sauce”, which occurs in only 133 store receipts out of 290 million. It also finds the itemset “milk, puff pastry”, which appears 152,991 times. Thanks to dynamic threshold adjustment and an adequate pruning strategy, TopPI efficiently traverses the relevant parts of the search space and can be parallelized on multi-core machines. Our experiments on datasets with different characteristics show the high performance of TopPI and its superiority over state-of-the-art mining algorithms. We also show experimentally on real datasets that TopPI allows the analyst to explore and discover valuable itemsets.
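The item-centric semantics described in the abstract can be illustrated on a toy dataset. The sketch below is a naive brute-force illustration, not the TopPI algorithm: it has no dynamic threshold adjustment, pruning, or parallelism, and it only scales to a handful of transactions. The function names `closed_itemsets` and `top_k_per_item` and the sample receipts are assumptions made for this example. It relies on the standard fact that an itemset is closed exactly when it equals the intersection of the transactions that contain it.

```python
from collections import defaultdict


def closed_itemsets(transactions):
    """Map every non-empty closed itemset to its support (brute force).

    Any intersection of a group of transactions is a closed itemset, so we
    grow the collection by intersecting each new transaction with every
    closed itemset found so far. Toy-sized inputs only.
    """
    closed = set()
    for t in transactions:
        t = frozenset(t)
        candidates = {t} | {c & t for c in closed}
        closed |= {s for s in candidates if s}  # drop empty intersections
    # Support = number of transactions that contain the itemset.
    return {c: sum(1 for t in transactions if c <= t) for c in closed}


def top_k_per_item(transactions, k):
    """For each item, return its k most frequent closed itemsets."""
    per_item = defaultdict(list)
    for itemset, support in closed_itemsets(transactions).items():
        for item in itemset:
            per_item[item].append((itemset, support))
    # Sort by decreasing support; break ties on the sorted item names.
    return {
        item: sorted(found, key=lambda p: (-p[1], sorted(p[0])))[:k]
        for item, found in per_item.items()
    }


# Hypothetical receipts, loosely echoing the itemsets named in the abstract.
TRANSACTIONS = [
    {"milk", "puff pastry"},
    {"milk", "puff pastry"},
    {"milk", "puff pastry", "eggs"},
    {"milk", "eggs"},
    {"nori", "wasabi", "sushi rice", "soy sauce"},
    {"nori", "wasabi", "sushi rice", "soy sauce"},
]

result = top_k_per_item(TRANSACTIONS, k=2)
```

In this toy data, a rare item such as "nori" still gets its own result (the four-item sushi itemset, with support 2), even though a single global frequency threshold tuned for "milk" could have discarded it. That is the long-tail behavior the abstract motivates.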

Citation (APA)

Kirchgessner, M., Leroy, V., Termier, A., Amer-Yahia, S., & Rousset, M. C. (2016). TopPI: An efficient algorithm for item-centric mining. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9829 LNCS, pp. 19–33). Springer Verlag. https://doi.org/10.1007/978-3-319-43946-4_2
