Optimal Subgroup Discovery in Purely Numerical Data

Alexandre Millot; Rémy Cazabet; Jean François Boulicaut

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020), Vol. 12085 LNAI, pp. 112–124

DOI: 10.1007/978-3-030-47436-2_9

Abstract

Subgroup discovery in labeled data is the task of discovering patterns in the description space of objects to find subsets of objects whose labels show an interesting distribution, for example the disproportionate representation of a label value. Discovering interesting subgroups in purely numerical data (numerical attributes and a numerical target label) has received little attention so far. Existing methods rely on discretization, which leads to a loss of information and suboptimal results. This is the case for the reference algorithm SD-Map*. We consider here the discovery of optimal subgroups according to an interestingness measure in purely numerical data. We leverage the concept of closed interval patterns and advanced enumeration and pruning techniques. The performance of our algorithm is studied empirically, and its added value with respect to SD-Map* is illustrated.
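To make the vocabulary of the abstract concrete, here is a minimal sketch of an interval pattern, its closure, and a quality score on a toy dataset. Everything in it is an illustrative assumption: the data, the function names `cover`, `closure` and `quality`, and the mean-based measure |S|^a × (mean of S − overall mean), which is a commonly used measure for numerical targets and not necessarily the one used in the paper. It does not reproduce the authors' enumeration or pruning.

```python
from statistics import mean

# Toy dataset (hypothetical): each row is (attribute values, numerical target).
DATA = [
    ((1.0, 5.0), 10.0),
    ((2.0, 6.0), 12.0),
    ((3.0, 1.0), 2.0),
    ((4.0, 2.0), 3.0),
]

def cover(pattern, data):
    """Rows whose every attribute value lies in the matching closed interval."""
    return [
        (x, y) for x, y in data
        if all(lo <= v <= hi for v, (lo, hi) in zip(x, pattern))
    ]

def closure(rows):
    """Tightest interval pattern (bounding box) around the given rows."""
    columns = list(zip(*(x for x, _ in rows)))
    return [(min(c), max(c)) for c in columns]

def quality(rows, data, a=0.5):
    """Assumed mean-based measure: |S|^a * (mean of S - mean of all rows)."""
    if not rows:
        return 0.0
    overall = mean(y for _, y in data)
    return len(rows) ** a * (mean(y for _, y in rows) - overall)

# An interval pattern has one closed interval per attribute.
pattern = [(0.0, 2.5), (0.0, 10.0)]
subgroup = cover(pattern, DATA)      # the first two rows
closed = closure(subgroup)           # [(1.0, 2.0), (5.0, 6.0)]
score = quality(subgroup, DATA)      # sqrt(2) * (11 - 6.75)
```

The closed pattern covers exactly the same rows as the looser one, so a search can restrict itself to closed patterns without missing any subgroup. Working on the raw interval bounds, rather than on pre-discretized bins, is what avoids the loss of information mentioned above.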

Citation (APA)

Millot, A., Cazabet, R., & Boulicaut, J. F. (2020). Optimal Subgroup Discovery in Purely Numerical Data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12085 LNAI, pp. 112–124). Springer. https://doi.org/10.1007/978-3-030-47436-2_9
