FinDiff: Diffusion Models for Financial Tabular Data Generation

Abstract

The sharing of microdata, such as fund holdings and derivative instruments, by regulatory institutions presents a unique challenge due to strict data confidentiality and privacy regulations. These restrictions often hinder the ability of both academics and practitioners to conduct collaborative research effectively. The emergence of generative models, particularly diffusion models, capable of synthesizing data that mimics the underlying distributions of real-world data presents a compelling solution. This work introduces Financial Tabular Diffusion (FinDiff), a diffusion model designed to generate real-world mixed-type financial tabular data for a variety of downstream tasks, such as economic scenario modeling, stress tests, and fraud detection. The model uses embedding encodings to model mixed-modality financial data, comprising both categorical and numeric attributes. The performance of FinDiff in generating synthetic tabular financial data is evaluated against state-of-the-art baseline models using three real-world financial datasets (two publicly available and one proprietary). Empirical results demonstrate that FinDiff generates synthetic tabular financial data with high fidelity, privacy, and utility.
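
To make the idea of embedding encodings for mixed-type rows concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not specify the embedding size, the noise schedule, or how categories are decoded, so the toy table, the linear beta schedule, the nearest-embedding decoding rule, and every name below (`encode`, `forward_noise`, `decode_categories`) are assumptions chosen for illustration. The sketch only shows how categorical columns could be mapped to continuous vectors, concatenated with numeric columns, and noised by a standard Gaussian forward diffusion step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy table: two categorical columns and two numeric columns.
cat_cardinalities = [4, 3]   # distinct values per categorical column (assumed)
emb_dim = 2                  # embedding width per categorical column (assumed)
n_rows = 5

cat_data = np.stack(
    [rng.integers(0, k, size=n_rows) for k in cat_cardinalities], axis=1
)
num_data = rng.normal(size=(n_rows, 2))

# One embedding table per categorical column. They are random here;
# in a trained model they would be learned parameters.
emb_tables = [rng.normal(size=(k, emb_dim)) for k in cat_cardinalities]


def encode(cat_data, num_data):
    """Replace each category id by its embedding and append the numeric columns."""
    cat_vecs = [table[cat_data[:, j]] for j, table in enumerate(emb_tables)]
    return np.concatenate(cat_vecs + [num_data], axis=1)


# Linear beta schedule, a common default for Gaussian diffusion (assumed).
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)


def forward_noise(x0, t, rng):
    """Sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps with eps ~ N(0, I)."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps


def decode_categories(x):
    """Map each embedding block back to the id of the nearest embedding row."""
    ids = []
    for j, table in enumerate(emb_tables):
        block = x[:, j * emb_dim:(j + 1) * emb_dim]
        dist = ((block[:, None, :] - table[None, :, :]) ** 2).sum(axis=-1)
        ids.append(dist.argmin(axis=1))
    return np.stack(ids, axis=1)


x0 = encode(cat_data, num_data)             # shape (n_rows, 2 * emb_dim + 2)
xt, eps = forward_noise(x0, t=50, rng=rng)  # noised rows at step 50
```

In a full model, a denoising network would be trained to predict `eps` from `xt` and `t`, and generated rows would be mapped back to categories with a rule such as `decode_categories`; both of those steps are outside what the abstract describes.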

Cite

Sattarov, T., Schreyer, M., & Borth, D. (2023). FinDiff: Diffusion Models for Financial Tabular Data Generation. In ICAIF 2023 - 4th ACM International Conference on AI in Finance (pp. 64–72). Association for Computing Machinery, Inc. https://doi.org/10.1145/3604237.3626876