GLSTM: A novel approach for prediction of real & synthetic PID diabetes data using GANs and LSTM classification model

Abstract

The generative adversarial network (GAN) is a major advance in modern artificial systems. Deep learning-based GANs can generate realistic synthetic tabular data. Synthetic data are used to enlarge a relatively small training dataset while preserving the confidentiality of the original data. In this context, we implemented a GAN framework that generates diabetes data to support health care professionals in clinical applications. The GAN is used to validate the Pima Indian Diabetes (PID) dataset. Preprocessing techniques, including the handling of missing values, outliers and class imbalance, improve data quality. Exploratory data analysis with heat maps, bar graphs and histograms is used for data visualisation. We employed hypothesis testing to examine the resemblance between the real data and the GAN-generated synthetic data. We propose a GAN-Long Short-Term Memory (GLSTM) system, in which a GAN is used for data augmentation and an LSTM is used for diabetes classification. In addition, several generative models (CTGAN, Vanilla GAN, Copula GAN, Gaussian Copula GAN, and TVAE GAN) are used to generate the synthetic dataset. Experiments were conducted on real data, on synthetic data, and on a combination of real and synthetic data. The model trained on both real and synthetic data reached an accuracy of 97%, compared to 92% when only real data was used. We also observed that synthetic data could be used in place of real data, as the mean correlation between synthetic and real data is 0.93. Our results outperform state-of-the-art methodologies.
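
The abstract does not say which hypothesis test was used to compare real and synthetic data. As a minimal sketch, assuming a per-feature two-sample Kolmogorov-Smirnov test is an acceptable stand-in, the comparison for one numeric column might look like the following. The function names `ks_statistic` and `ks_critical` are hypothetical, the sample values are toy numbers rather than PID data, and the critical value uses the standard large-sample approximation.

```python
import math
from bisect import bisect_right


def ks_statistic(real, synth):
    """Largest gap between the empirical CDFs of the two samples."""
    a, b = sorted(real), sorted(synth)
    gap = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value for sample sizes n and m."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))


if __name__ == "__main__":
    # Toy values for one feature (illustrative only, not taken from the study).
    real_col = [85, 89, 103, 110, 116, 125, 137, 148, 166, 183]
    synth_col = [88, 92, 101, 112, 118, 121, 140, 151, 160, 179]

    d = ks_statistic(real_col, synth_col)
    threshold = ks_critical(len(real_col), len(synth_col))
    # Failing to reject means the test found no evidence that the two
    # distributions differ; it does not prove that they are identical.
    verdict = "reject" if d > threshold else "fail to reject"
    print(f"D = {d:.3f}, critical = {threshold:.3f}: {verdict} equality")
```

Repeating this per column gives one decision per feature. With small samples the approximation is rough, so a library routine that reports an exact p-value would be preferable in practice.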

Cite

Jaiswal, S., & Gupta, P. (2023). GLSTM: A novel approach for prediction of real & synthetic PID diabetes data using GANs and LSTM classification model. International Journal of Experimental Research and Review, 30, 32–45. https://doi.org/10.52756/ijerr.2023.v30.004
