Breast cancer classification with missing data imputation

Imane Chlioui; Ali Idri; Ibtissam Abnane; Juan Manuel Carillo de Gea; Jose Luis Fernández-Alemán

Conference Proceedings

Advances in Intelligent Systems and Computing (2019) 932 13-23

DOI: 10.1007/978-3-030-16187-3_2

Abstract

Missing data (MD) is a common problem when applying data mining to breast cancer datasets, since it degrades the performance of the data mining classifier. This study evaluates the influence of MD on three classifiers: the C4.5 decision tree, the support vector machine (SVM), and the multilayer perceptron (MLP). For this purpose, 162 experiments were conducted using KNN imputation with three missingness mechanisms (MCAR, MAR, and NMAR) and nine MD percentages (from 10% to 90%) applied to two Wisconsin breast cancer datasets. The MD percentage negatively affects classifier performance. MLP achieved the lowest accuracy rates regardless of the MD mechanism or percentage.
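The experiment count in the abstract follows from the design: 3 classifiers × 3 mechanisms × 9 percentages × 2 datasets = 162. The following is a minimal sketch of that grid, together with MCAR injection and a simple KNN imputation, written in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the dataset labels, the function names, the cell-wise MCAR masking, the choice of `k`, and the column-mean fallback are all hypothetical, and the MAR and NMAR mechanisms are not shown.

```python
import itertools
import math
import random

CLASSIFIERS = ["C4.5", "SVM", "MLP"]
MECHANISMS = ["MCAR", "MAR", "NMAR"]
PERCENTAGES = list(range(10, 100, 10))      # 10%, 20%, ..., 90%
DATASETS = ["wisconsin_1", "wisconsin_2"]   # placeholder labels (assumption)


def experiment_grid():
    """Enumerate every (dataset, mechanism, percentage, classifier) combination."""
    return list(itertools.product(DATASETS, MECHANISMS, PERCENTAGES, CLASSIFIERS))


def inject_mcar(rows, fraction, rng):
    """Copy `rows` and set a uniformly random `fraction` of cells to None.

    Assumption: missingness is applied cell-wise over the whole feature matrix.
    """
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    n_missing = int(round(fraction * len(cells)))
    out = [list(r) for r in rows]
    for i, j in rng.sample(cells, n_missing):
        out[i][j] = None
    return out


def _distance(a, b):
    """Root mean squared difference over features observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(rows, k=3):
    """Fill each missing cell with the mean of that feature over the k nearest rows.

    Falls back to the observed column mean when no comparable neighbour exists;
    a cell stays None if the whole column is missing.
    """
    out = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = [
                (_distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if not nearest:
                nearest = [r[j] for r in rows if r[j] is not None]
            if nearest:
                out[i][j] = sum(nearest) / len(nearest)
    return out


if __name__ == "__main__":
    print(len(experiment_grid()))
    toy = [[1.0, 2.0], [1.0, None], [5.0, 6.0]]
    print(knn_impute(toy, k=1))
```

In a fuller pipeline, each grid entry would mask the training data with the chosen mechanism and percentage, impute it, train the classifier, and record its accuracy.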

Cite

Chlioui, I., Idri, A., Abnane, I., de Gea, J. M. C., & Fernández-Alemán, J. L. (2019). Breast cancer classification with missing data imputation. In Advances in Intelligent Systems and Computing (Vol. 932, pp. 13–23). Springer Verlag. https://doi.org/10.1007/978-3-030-16187-3_2
