Most of the medical datasets suffer from missing data, due to the expense of some tests or human faults while recording these tests. This issue affects the performance of the machine learning models because the values of some features will be missing. Therefore, there is a need for a specific type of methods for imputing these missing data. In this research, the salp swarm algorithm (SSA) is used for generating and imputing the missing values in the pain in my ass (also known Pima) Indian diabetes disease (PIDD) dataset, the proposed algorithm is called (ISSA). The obtained results showed that the classification performance of three different classifiers which are support vector machine (SVM), K-nearest neighbour (KNN), and Naïve Bayesian classifier (NBC) have been enhanced as compared to the dataset before applying the proposed method. Moreover, the results indicated that issa was performed better than the statistical imputation techniques such as deleting the samples with missing values, replacing the missing values with zeros, mean, or random values.
CITATION STYLE
Hassan, G. S., Ali, N. J., Abdulsahib, A. K., Mohammed, F. J., & Gheni, H. M. (2023). A missing data imputation method based on salp swarm algorithm for diabetes disease. Bulletin of Electrical Engineering and Informatics, 12(3), 1700–1710. https://doi.org/10.11591/eei.v12i3.4528
Mendeley helps you to discover research relevant for your work.