An efficient polynomial delay algorithm for pseudo frequent itemset mining

Abstract

Mining frequently appearing patterns in a database is a basic problem in informatics, especially in data mining. In particular, when the input database is a collection of subsets of an itemset, the problem is called the frequent itemset mining problem, and it has been extensively studied. In real-world use, one of the difficulties of frequent itemset mining is that real-world data is often incorrect or missing some parts. As a result, some records that should include a pattern do not contain it. To deal with real-world problems, one can use an ambiguous inclusion relation and find patterns that are mostly included in many records. However, computational difficulty has prevented such approaches from being actively used in practice. In this paper, we use an alternative inclusion relation in which we consider an itemset P to be included in an itemset T if at most k items of P are not included in T, i.e., |P\T| ≤ k. We address the problem of enumerating frequent itemsets under this inclusion relation and propose an efficient polynomial delay, polynomial space algorithm. Moreover, to enable us to skip many small, non-valuable frequent itemsets, we propose an algorithm for directly enumerating frequent itemsets of a certain size. © Springer-Verlag Berlin Heidelberg 2007.
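To make the inclusion relation |P\T| ≤ k concrete, the following is a minimal Python sketch. It is a brute-force illustration of the definition only, not the polynomial delay algorithm proposed in the paper. The function names (`pseudo_included`, `pseudo_frequency`, `pseudo_frequent_itemsets`), the `min_support` threshold parameter, and the toy database are assumptions introduced here for illustration.

```python
from itertools import combinations


def pseudo_included(pattern, record, k):
    """Return True if at most k items of pattern are missing from record,
    i.e. |P \\ T| <= k."""
    return len(set(pattern) - set(record)) <= k


def pseudo_frequency(pattern, database, k):
    """Count the records that include pattern under the relaxed relation."""
    return sum(1 for record in database if pseudo_included(pattern, record, k))


def pseudo_frequent_itemsets(database, k, min_support, size):
    """Naive enumeration (exponential in general): test every itemset of the
    given size and keep those whose pseudo frequency reaches min_support.
    This is a hypothetical helper, not the method from the paper."""
    items = sorted(set().union(*database))
    return [
        candidate
        for candidate in combinations(items, size)
        if pseudo_frequency(candidate, database, k) >= min_support
    ]


# Hypothetical toy database: each record is a set of items.
database = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c", "d"},
    {"b", "d"},
]

if __name__ == "__main__":
    # Exact inclusion (k = 0) versus allowing one missing item (k = 1).
    print(pseudo_frequency({"a", "b", "c"}, database, k=0))
    print(pseudo_frequency({"a", "b", "c"}, database, k=1))
    print(pseudo_frequent_itemsets(database, k=1, min_support=3, size=3))
```

With k = 0 the relation reduces to ordinary set inclusion, so the same functions also cover standard frequent itemset counting. Fixing `size` mirrors the idea of looking only at itemsets of a certain size, although here it is done by exhaustive testing rather than direct enumeration.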

Cite

Uno, T., & Arimura, H. (2007). An efficient polynomial delay algorithm for pseudo frequent itemset mining. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4755 LNAI, pp. 219–230). Springer Verlag. https://doi.org/10.1007/978-3-540-75488-6_21
