New Method Based Pre-Processing to Tackle Missing and High Dimensional Data of CRISP-DM Approach

Abstract

The kidneys are among the most important organs of the human excretory system. They keep blood concentrations constant (homeostasis) and help control blood pressure (BP). When the kidneys do not function properly, the result is kidney failure. Over the past decade, data mining methods have been used to diagnose kidney failure. The dataset used to predict kidney failure was compiled by Soundarapandian and is named the Chronic Kidney Disease (CKD) dataset. However, the CKD dataset in its original form contains missing values and high-dimensional data, both of which affect classification results. This research proposes two preprocessing methods: the mode in every class (MEC) method to handle missing values, and the weight information gain (WIG) method to handle high dimensionality. The combined method is named MEC + WIG. MEC + WIG is compared with the original method (the original data) and with the MEC method, and is evaluated by the accuracy of traditional classifiers (k-NN, Naïve Bayes, C4.5, and CART). The average accuracy of MEC + WIG was 98.13%, higher than the 88.56% of the original method and the 92.88% of the MEC method. A Friedman test showed significant differences among the three methods, with a p-value of 0.02. It can be concluded that MEC + WIG improves the performance of the traditional k-NN, Naïve Bayes, C4.5, and CART classifiers by overcoming the problems of missing values and high-dimensional data.
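The two preprocessing steps can be illustrated with a minimal sketch. It is not the authors' implementation: it reads "mode in every class" as filling each missing cell with the most frequent value of that attribute among records of the same class, and it stands in for WIG with plain information gain plus a cut-off. The function names, the toy records, and the `threshold` value of 0.5 are all hypothetical, and the sketch assumes categorical attributes (numeric ones would first need discretization).

```python
from collections import Counter, defaultdict
from math import log2


def impute_mode_per_class(rows, labels):
    """Fill each None cell with the most frequent value of that attribute
    among rows that share the same class label (assumed reading of MEC)."""
    n_cols = len(rows[0])
    # counts[label][column] counts the observed (non-missing) values
    counts = defaultdict(lambda: [Counter() for _ in range(n_cols)])
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            if value is not None:
                counts[label][j][value] += 1

    filled = []
    for row, label in zip(rows, labels):
        new_row = []
        for j, value in enumerate(row):
            if value is None and counts[label][j]:
                value = counts[label][j].most_common(1)[0][0]
            new_row.append(value)
        filled.append(new_row)
    return filled


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(column, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the attribute."""
    groups = defaultdict(list)
    for value, label in zip(column, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_by_information_gain(rows, labels, threshold):
    """Return the indices of attributes whose gain reaches the (hypothetical) threshold."""
    gains = [information_gain([r[j] for r in rows], labels) for j in range(len(rows[0]))]
    keep = [j for j, g in enumerate(gains) if g >= threshold]
    return keep, gains


# Toy records with two categorical attributes; None marks a missing value.
rows = [
    ["yes", "normal"],
    [None, "abnormal"],
    ["yes", "normal"],
    ["no", "normal"],
    ["no", None],
    [None, "normal"],
]
labels = ["ckd", "ckd", "ckd", "notckd", "notckd", "notckd"]

filled = impute_mode_per_class(rows, labels)
keep, gains = select_by_information_gain(filled, labels, threshold=0.5)
print(filled)
print(keep, gains)
```

In this toy data the first attribute separates the two classes perfectly after imputation, so it alone passes the cut-off. The reduced dataset would then be passed to the classifiers named in the abstract.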

Cite

Suntoro, J., Ilham, A., & Rani, H. A. D. (2020). New Method Based Pre-Processing to Tackle Missing and High Dimensional Data of CRISP-DM Approach. In Journal of Physics: Conference Series (Vol. 1471). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1471/1/012012
