PRINCIPAL COMPONENT ANALYSIS IMPLEMENTATION ON MACHINE LEARNING IN DIABETES CLASSIFICATION

Michael Tantowen; Krisna Putra; Mahmud Isnan; Bens Pardamean

Journal Article, Open Access

Communications in Mathematical Biology and Neuroscience (2024) 2024

DOI: 10.28919/cmbn/8492

Abstract

Diabetes Mellitus, a global health burden linked to increased cancer risks, can be identified through variables like BMI, age, blood sugar, and HbA1c. This study explored diverse machine learning techniques for diabetes prediction, emphasizing the role of dimensionality reduction and feature selection in enhancing model accuracy. Our aim is to compare the performance measures of multiple machine learning algorithms on the original data against the same data after an imbalance-handling sampling method or principal component analysis (PCA) was applied. The study used Kaggle's "Diabetes Prediction Dataset" with 100,000 entries, employing eight features and one target variable related to diabetes. In the experiment, the data were divided into three distinct datasets: 1) the whole dataset, 2) a dataset containing males only, and 3) a dataset containing females only. Multiple machine learning models were trained on each of these datasets: K-Nearest Neighbor (KNN), Decision Tree (DT), Support Vector Machines (SVM), XGBoost (XGB), and Random Forest (RF). The findings revealed that XGB outperformed the other models, with an F1-score of 80.87 on the imbalanced dataset. Moreover, in diabetes classification by gender, the Random Forest model was better for males, with an F1-score of 80.34, while XGB performed well for females, with an F1-score of 81.9.
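The abstract reports F1-scores rather than accuracy for an imbalanced dataset. The following minimal sketch illustrates why that choice matters. It is not the authors' code: the label counts (10 positives out of 100) and both sets of predictions are hypothetical, chosen only for illustration, and the function names are assumptions. The scores here are fractions in [0, 1], whereas the values in the abstract appear to be expressed as percentages.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: define F1 as 0 to avoid division by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced labels: 10 diabetic (1) and 90 non-diabetic (0).
y_true = [1] * 10 + [0] * 90

# A trivial classifier that never predicts diabetes.
always_negative = [0] * 100

# A hypothetical classifier: finds 8 of the 10 positives, raises 2 false alarms.
y_pred = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 88

print(f"always negative: accuracy={accuracy(y_true, always_negative):.2f}, "
      f"F1={f1_score(y_true, always_negative):.2f}")
print(f"hypothetical model: accuracy={accuracy(y_true, y_pred):.2f}, "
      f"F1={f1_score(y_true, y_pred):.2f}")
# prints:
# always negative: accuracy=0.90, F1=0.00
# hypothetical model: accuracy=0.96, F1=0.80
```

The trivial classifier scores 0.90 on accuracy while detecting no positive cases, which the F1-score exposes immediately.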

Citation (APA)

Tantowen, M., Putra, K., Isnan, M., & Pardamean, B. (2024). PRINCIPAL COMPONENT ANALYSIS IMPLEMENTATION ON MACHINE LEARNING IN DIABETES CLASSIFICATION. Communications in Mathematical Biology and Neuroscience, 2024. https://doi.org/10.28919/cmbn/8492
