Distributed subgroup mining

Michael Wurst; Martin Scholz

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4213 LNAI 421-433

DOI: 10.1007/11871637_40

Abstract

Subgroup discovery is a popular form of supervised rule learning, applicable to both descriptive and predictive tasks. In this work we study two natural extensions of classical subgroup discovery to distributed settings. In the first variant, the goal is to efficiently identify global subgroups, i.e., the rules an analysis would yield after collecting all the data in a single central database. In contrast, the second variant takes the locality of data explicitly into account: the aim is to find patterns that point out major differences between individual databases with respect to a specific property of interest (the target attribute). We point out substantial differences between these novel learning problems and other kinds of distributed data mining tasks. These differences motivate new search and communication strategies aimed at minimizing computation time and communication costs. We present and empirically evaluate new algorithms for both variants. © Springer-Verlag Berlin Heidelberg 2006.

Citation (APA)

Wurst, M., & Scholz, M. (2006). Distributed subgroup mining. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4213 LNAI, pp. 421–433). Springer-Verlag. https://doi.org/10.1007/11871637_40
