Abstract
Synthetic data generation research is progressing rapidly, and novel methods are proposed frequently. Earlier approaches used statistical methods to learn the distributions of real data and then sampled synthetic data from those distributions. Recent advances in generative models have led to more efficient modeling of complex high-dimensional datasets. In addition, privacy concerns have led to the development of robust models with a lower risk of privacy breaches. First, the paper presents a comprehensive survey of existing techniques for tabular data generation and of evaluation metrics. Second, it presents a comparative analysis of state-of-the-art synthetic data generation techniques, specifically CTGAN and TVAE, for small, medium, and large-scale datasets with varying data distributions. It further evaluates the synthetic data using quantitative and qualitative metrics and techniques. Finally, the paper presents the outcomes and highlights the issues and shortcomings that still need to be addressed.
Cite
Yadav, P., Gaur, M., Madhukar, R. K., Verma, G., Kumar, P., Fatima, N., … Dwivedi, Y. R. (2024). Rigorous Experimental Analysis of Tabular Data Generated using TVAE and CTGAN. International Journal of Advanced Computer Science and Applications, 15(4), 1250–1262. https://doi.org/10.14569/IJACSA.2024.01504125