Differentiable Pattern Set Mining

9Citations
Citations of this article
17Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Pattern set mining has been successful in discovering small sets of highly informative and useful patterns from data. To find good models, existing methods heuristically explore the twice-exponential search space over all possible pattern sets in a combinatorial way, by which they are limited to data over at most hundreds of features, as well as likely to get stuck in local minima. Here, we propose a gradient based optimization approach that allows us to efficiently discover high-quality pattern sets from data of millions of rows and hundreds of thousands of features. In particular, we propose a novel type of neural autoencoder called BinaPs, using binary activations and binarizing weights in each forward pass, which are directly interpretable as conjunctive patterns. For training, optimizing a data-sparsity aware reconstruction loss, continuous versions of the weights are learned in small, noisy steps. This formulation provides a link between the discrete search space and continuous optimization, thus allowing for a gradient based strategy to discover sets of high-quality and noise-robust patterns. Through extensive experiments on both synthetic and real world data, we show that BinaPs discovers high quality and noise robust patterns, and unique among all competitors, easily scales to data of supermarket transactions or biological variant calls.

Cite

CITATION STYLE

APA

Fischer, J., & Vreeken, J. (2021). Differentiable Pattern Set Mining. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 383–392). Association for Computing Machinery. https://doi.org/10.1145/3447548.3467348

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free