Mining classification rules from datasets with large number of many-valued attributes

Giovanni Giuffrida; Wesley W. Chu; Dominique M. Hanssens

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2000) 1777 335-349

DOI: 10.1007/3-540-46439-5_23

16 Citations

22 Readers

Abstract

Decision tree induction algorithms scale well to large datasets because of their univariate, divide-and-conquer approach. However, they may fail to discover effective knowledge when the input dataset consists of a large number of uncorrelated many-valued attributes. In this paper we present an algorithm, Noah, that tackles this problem by applying a multivariate search. A multivariate search consumes much more computation time and memory, which may be prohibitive for large datasets. We remedy this problem by exploiting effective pruning strategies and efficient data structures. We applied our algorithm to a real marketing application of cross-selling. Experimental results revealed that the application database was too complex for C4.5, which failed to discover any useful knowledge. The application database was also too large for various well-known rule discovery algorithms, which were not able to complete their task. The pruning techniques used in Noah are general in nature and can be used in other mining systems.
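The abstract does not describe Noah's search or its pruning strategies in detail, so the following is only a minimal sketch of what a multivariate rule search with pruning can look like in general. It assumes a plain levelwise search over conjunctions of attribute=value tests with minimum-support pruning, which is a stand-in technique and not the method of the paper. The function name `mine_rules`, its parameters, and the toy cross-selling records are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_rules(rows, target, min_support=2, min_confidence=0.8, max_len=2):
    """Levelwise search over conjunctions of (attribute, value) tests.

    Hypothetical illustration only: it uses plain minimum-support pruning,
    not the pruning strategies or data structures of the Noah paper.
    """
    # Each row becomes a set of (attribute, value) items; the class is kept apart.
    data = [
        (frozenset((a, v) for a, v in r.items() if a != target), r[target])
        for r in rows
    ]
    rules = []
    # Level 1 candidates: every single attribute=value test seen in the data.
    candidates = {frozenset([item]) for items, _ in data for item in items}
    for size in range(1, max_len + 1):
        frequent = set()
        for cond in candidates:
            # Class distribution of the rows that satisfy every test in cond.
            classes = Counter(label for items, label in data if cond <= items)
            support = sum(classes.values())
            if support < min_support:
                continue  # pruned: no longer conjunction can match more rows
            frequent.add(cond)
            label, hits = classes.most_common(1)[0]
            if hits / support >= min_confidence:
                rules.append((tuple(sorted(cond)), label, support, hits / support))
        # Join surviving conjunctions into candidates that are one test longer,
        # keeping only those that test each attribute at most once.
        candidates = set()
        for a, b in combinations(frequent, 2):
            union = a | b
            attrs = {attr for attr, _ in union}
            if len(union) == size + 1 and len(attrs) == size + 1:
                candidates.add(union)
    return rules
```

On records where no single attribute predicts the class but a pair of attributes does, a search of this kind can return the two-attribute rule that a purely univariate split criterion would score poorly. The support threshold keeps the number of candidate conjunctions from growing with every attribute-value combination.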

Cite

Giuffrida, G., Chu, W. W., & Hanssens, D. M. (2000). Mining classification rules from datasets with large number of many-valued attributes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1777, pp. 335–349). Springer Verlag. https://doi.org/10.1007/3-540-46439-5_23
