Tab-VAE: A Novel VAE for Generating Synthetic Tabular Data

Abstract

Variational Autoencoders (VAEs) suffer from the well-known problem of overpruning, or posterior collapse, caused by strong regularization when working in a sufficiently high-dimensional latent space. When VAEs are used to generate tabular data, one-hot encoding of categorical data expands the dimensionality of the feature space dramatically, making multi-class categorical data challenging to model. In this paper, we propose Tab-VAE, a novel VAE-based approach to generating synthetic tabular data that tackles this challenge by introducing a sampling technique at inference for categorical variables. A detailed review of the current state-of-the-art models shows that most tabular data generation approaches draw their methodologies from Generative Adversarial Networks (GANs), while the simpler, more stable VAE approach is ignored. Our extensive evaluation of Tab-VAE against other leading generative models shows that Tab-VAE improves significantly on state-of-the-art VAEs. It also shows that Tab-VAE outperforms the best GAN-based tabular data generators, paving the way for a powerful and less computationally expensive tabular data generation model.
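
The abstract does not spell out the sampling technique, so the following is only an illustrative sketch of one plausible reading: at inference, each categorical value is drawn from the decoder's softmax probabilities instead of being fixed to the most likely class. The function names, the three-class column, and its probabilities are hypothetical and are not taken from the paper.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_category(logits, rng, sample=True):
    """Turn the decoder logits for one categorical column into a class index.

    sample=True draws the index from the softmax distribution (an assumed
    reading of the abstract); sample=False takes the most likely class.
    """
    probs = softmax(logits)
    if sample:
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return max(range(len(probs)), key=probs.__getitem__)

rng = random.Random(0)
# Hypothetical decoder output for a 3-class column with probabilities 0.6/0.3/0.1.
logits = [math.log(p) for p in (0.6, 0.3, 0.1)]
n_rows = 10_000

sampled = [decode_category(logits, rng, sample=True) for _ in range(n_rows)]
greedy = [decode_category(logits, rng, sample=False) for _ in range(n_rows)]

# Share of generated rows that fall into each class.
sampled_freq = [sampled.count(k) / n_rows for k in range(3)]
greedy_freq = [greedy.count(k) / n_rows for k in range(3)]
print("sampled:", sampled_freq)
print("greedy: ", greedy_freq)
```

In this toy setting, sampling yields class frequencies close to the decoder's probabilities, whereas always taking the most likely class assigns every generated row to the majority class and never produces the minority classes.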

Cite

Tazwar, S. M., Knobbout, M., Quesada, E. H., & Popa, M. (2024). Tab-VAE: A Novel VAE for Generating Synthetic Tabular Data. In International Conference on Pattern Recognition Applications and Methods (Vol. 1, pp. 17–26). Science and Technology Publications, Lda. https://doi.org/10.5220/0012302400003654