GENCCS: A correlated group difference approach to contrast set mining

Mondelle Simeon; Robert Hilderman

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6871 LNAI, pp. 140-154 (2011)

DOI: 10.1007/978-3-642-23199-5_11

Citations: 1

Readers: 4

Abstract

Contrast set mining has developed as a data mining task which aims at discerning differences amongst groups. These groups can be patients, organizations, molecules, and even time-lines, and are defined by a selected property that distinguishes one from the other. A contrast set is a conjunction of attribute-value pairs that differ significantly in their distribution across groups. The search for contrast sets can be prohibitively expensive on relatively large datasets because every combination of attribute-values must be examined, causing a potential exponential growth of the search space. In this paper, we introduce the notion of a correlated group difference (CGD) and propose a contrast set mining technique that utilizes mutual information and all confidence to select the attribute-value pairs that are most highly correlated, in order to mine CGDs. Our experiments on real datasets demonstrate the efficiency of our approach and the interestingness of the CGDs discovered. © 2011 Springer-Verlag.
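
The quantities named in the abstract can be illustrated with a small sketch. This is not the GENCCS algorithm itself, which the abstract does not specify. It only shows, on invented toy records, textbook definitions of per-group support for a candidate contrast set, all-confidence of an itemset, and mutual information between two attributes. The record layout, the function names, and the use of a simple support difference in place of a statistical significance test are all assumptions made for illustration.

```python
from collections import Counter
from math import log2

# Toy records: (group label, {attribute: value}). Invented for illustration only.
RECORDS = [
    ("A", {"smoker": "yes", "age": "old"}),
    ("A", {"smoker": "yes", "age": "old"}),
    ("A", {"smoker": "yes", "age": "young"}),
    ("A", {"smoker": "no", "age": "young"}),
    ("B", {"smoker": "no", "age": "young"}),
    ("B", {"smoker": "no", "age": "old"}),
    ("B", {"smoker": "yes", "age": "old"}),
    ("B", {"smoker": "no", "age": "young"}),
]


def matches(row, itemset):
    """True if the row satisfies every attribute-value pair in the conjunction."""
    return all(row.get(attr) == value for attr, value in itemset)


def support(records, itemset, group=None):
    """Fraction of rows (optionally within one group) that match the itemset."""
    rows = [row for g, row in records if group is None or g == group]
    if not rows:
        return 0.0
    return sum(matches(row, itemset) for row in rows) / len(rows)


def support_difference(records, itemset, g1, g2):
    """Absolute difference in support between two groups (a crude stand-in
    for the significance test a real contrast set miner would apply)."""
    return abs(support(records, itemset, g1) - support(records, itemset, g2))


def all_confidence(records, itemset):
    """Standard all-confidence: sup(X) divided by the largest single-item support."""
    largest = max(support(records, [item]) for item in itemset)
    return support(records, itemset) / largest if largest else 0.0


def mutual_information(records, attr_x, attr_y):
    """Mutual information (in bits) between two categorical attributes."""
    n = len(records)
    joint = Counter((row[attr_x], row[attr_y]) for _, row in records)
    px = Counter(row[attr_x] for _, row in records)
    py = Counter(row[attr_y] for _, row in records)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


candidate = [("smoker", "yes"), ("age", "old")]
diff = support_difference(RECORDS, candidate, "A", "B")
ac = all_confidence(RECORDS, candidate)
mi = mutual_information(RECORDS, "smoker", "age")
```

On this toy data the candidate conjunction has support 0.5 in group A and 0.25 in group B. A miner in the spirit of the abstract would use correlation measures such as the last two functions to decide which attribute-value pairs are worth combining before testing group differences, rather than enumerating every combination.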

Citation (APA)

Simeon, M., & Hilderman, R. (2011). GENCCS: A correlated group difference approach to contrast set mining. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6871 LNAI, pp. 140–154). https://doi.org/10.1007/978-3-642-23199-5_11
