More efficient algorithms for mining high-utility itemsets with multiple minimum utility thresholds

Wensheng Gan; Jerry Chun Wei Lin; Philippe Fournier Viger; Han Chieh Chao

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016) 9827 LNCS 71-87

DOI: 10.1007/978-3-319-44403-1_5

Abstract

Mining high-utility itemsets (HUIs) is a popular data mining task that consists of discovering sets of items that yield a high profit in a transaction database. Although HUI mining has numerous applications, a key limitation is that a single minimum utility threshold (minutil) is used to assess the utility of all items. This simplifying assumption is unrealistic: in real life, not all items have the same unit profit, and thus not all items have an equal chance of generating a high profit. As a result, if the minutil threshold is set high, patterns containing items with a low unit profit are often missed, while if minutil is set low, the number of patterns becomes unmanageable. To address this issue, this paper presents an efficient tree-based algorithm named HIMU for mining HUIs using multiple minimum utility thresholds. A novel tree structure called the multiple item utility set-enumeration tree (MIU-tree) is proposed, together with the global and conditional downward closure (GDC and CDC) properties of HUIs in the MIU-tree. Moreover, a compact vertical utility-list structure is adopted to store the information required for discovering HUIs without performing additional database scans or generating candidates. An extensive experimental study on real-world and synthetic datasets shows that the proposed structures and properties greatly improve the efficiency of the algorithm in terms of runtime and scalability.
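
To make the problem setting concrete, the following is a minimal brute-force sketch in Python. It is not the HIMU algorithm and uses neither the MIU-tree nor utility-lists; it only illustrates what mining with one threshold per item could look like. The toy database, the unit profits, and the per-item thresholds are hypothetical. The rule that an itemset's threshold is the smallest threshold among its items is an assumption that the abstract does not state, although it is a common convention in work on multiple minimum thresholds.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
unit_profit = {"a": 5, "b": 1, "c": 2}   # hypothetical unit profits
min_util = {"a": 12, "b": 4, "c": 8}     # hypothetical per-item thresholds


def utility(itemset, db, profit):
    """Sum quantity * unit profit over the transactions containing every item."""
    total = 0
    for t in db:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset)
    return total


def itemset_threshold(itemset, mu):
    """Assumed rule: an itemset's threshold is the least threshold of its items."""
    return min(mu[i] for i in itemset)


def mine_brute_force(db, profit, mu):
    """Enumerate every itemset and keep those meeting their own threshold."""
    items = sorted(profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, db, profit)
            # Skip itemsets that never occur (utility 0).
            if u > 0 and u >= itemset_threshold(itemset, mu):
                result[itemset] = u
    return result


huis = mine_brute_force(transactions, unit_profit, min_util)
```

On this toy data, the item b alone falls short of its own threshold, yet the itemset {a, b} qualifies. Utility is therefore not anti-monotone, which suggests why an efficient algorithm needs dedicated pruning properties such as the GDC and CDC properties named in the abstract instead of exhaustive enumeration.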

Cite

Gan, W., Lin, J. C. W., Viger, P. F., & Chao, H. C. (2016). More efficient algorithms for mining high-utility itemsets with multiple minimum utility thresholds. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9827 LNCS, pp. 71–87). Springer Verlag. https://doi.org/10.1007/978-3-319-44403-1_5
