Missing data (MD) is a common problem when applying data mining to breast cancer datasets because it affects the performance of the data mining classifier. This study evaluates the influence of MD on three classifiers: the C4.5 decision tree, Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP). For this purpose, 162 experiments were conducted using KNN imputation with three missingness mechanisms (missing completely at random, MCAR; missing at random, MAR; and not missing at random, NMAR) and nine MD percentages (from 10% to 90%) applied to two Wisconsin breast cancer datasets. The MD percentage negatively affects classifier performance. MLP achieved the lowest accuracy rates regardless of the MD mechanism or percentage.
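The abstract does not describe how the imputation was implemented, so the following is only a minimal sketch of the general idea: inject MCAR missingness at a chosen rate, then fill each gap with the mean of that feature over the k nearest rows. The function names (`inject_mcar`, `knn_impute`), the rescaled Euclidean distance over shared features, and the column-mean fallback are all assumptions made for illustration, not the authors' method. MAR and NMAR masking, where the chance of a gap depends on observed or unobserved values, are not covered here.

```python
import math
import random


def inject_mcar(rows, rate, seed=0):
    """Return a copy of rows where each cell is set to None with probability `rate`.

    Every cell is masked independently of all values, which is the MCAR assumption.
    """
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def _distance(a, b):
    """Euclidean distance over features observed in both rows.

    The squared sum is rescaled by (total features / shared features) so rows
    with few shared features are not favoured. Returns inf if nothing is shared.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def knn_impute(rows, k=3):
    """Fill each None with the mean of that feature over the k nearest donor rows."""
    n_features = len(rows[0])
    # Fallback (an assumption of this sketch): column mean, or 0.0 if the
    # whole column is missing.
    col_means = []
    for j in range(n_features):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(sum(observed) / len(observed) if observed else 0.0)

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other rows that observe feature j and share at least
            # one observed feature with the current row.
            candidates = []
            for idx, other in enumerate(rows):
                if idx == i or other[j] is None:
                    continue
                d = _distance(row, other)
                if d < math.inf:
                    candidates.append((d, other[j]))
            candidates.sort()
            nearest = [v for _, v in candidates[:k]]
            imputed[i][j] = sum(nearest) / len(nearest) if nearest else col_means[j]
    return imputed


# Toy example: row 1 is closest to rows 0 and 3 on the first feature,
# so its gap becomes the mean of 2.0 and 2.2.
toy = [[1.0, 2.0], [1.0, None], [5.0, 10.0], [1.2, 2.2]]
print(knn_impute(toy, k=2)[1])
```

In an experiment like the one summarized above, such a step would sit between masking and classifier training, repeated for each missingness rate; distances are computed on the original incomplete rows, so earlier fills do not influence later ones.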
Chlioui, I., Idri, A., Abnane, I., de Gea, J. M. C., & Fernández-Alemán, J. L. (2019). Breast cancer classification with missing data imputation. In Advances in Intelligent Systems and Computing (Vol. 932, pp. 13–23). Springer Verlag. https://doi.org/10.1007/978-3-030-16187-3_2