Breast cancer is the most common type of cancer in women worldwide. Despite this, few studies use data mining techniques in ways that can help medical doctors in their daily practice. This paper presents a comparative study of three ensemble methods (TreeBagger, LPBoost and Subspace) using a clinical dataset with 25% missing values to predict the overall survival of women with breast cancer. To impute the missing values, the k-nearest neighbor (k-NN) algorithm was used with four distinct neighbor values, in order to determine the best one for this particular scenario. Tests were performed for each of the three ensemble methods and each k-NN configuration, and their performance was compared using a Friedman test. Despite the complexity of this challenge, the results are promising: the best algorithm configuration (TreeBagger using 3 neighbors) achieves a prediction accuracy of 73%. © Springer International Publishing Switzerland 2014.
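The k-NN imputation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `knn_impute`, the distance rescaling over shared features, the use of the neighbor mean, and the toy data are all assumptions made here for illustration, since the abstract does not specify them.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Assumption (not from the paper): distance is Euclidean over the features
    observed in both rows, rescaled by the fraction of shared features.
    """
    n_features = len(rows[0])

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common observed feature: not comparable
        sq = sum((x - y) ** 2 for x, y in shared)
        return math.sqrt(sq * n_features / len(shared))

    imputed = []
    for i, row in enumerate(rows):
        new_row = list(row)  # copy, so the input is left untouched
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where feature j is observed.
            donors = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:  # leave the gap if no donor is available
                new_row[j] = sum(nearest) / len(nearest)
        imputed.append(new_row)
    return imputed


# Toy data, made up for illustration: [age in years, tumour size in mm]
toy = [
    [50.0, 20.0],
    [52.0, None],
    [51.0, 22.0],
    [80.0, 60.0],
]
filled = knn_impute(toy, k=2)
```

In this toy example the missing tumour size is filled from the two patients closest in age. Varying `k`, as the study does with four neighbor values, changes how many donors contribute to each imputed value.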
CITATION STYLE
Abreu, P. H., Amaro, H., Silva, D. C., Machado, P., Abreu, M. H., Afonso, N., & Dourado, A. (2014). Overall survival prediction for women breast cancer using ensemble methods and incomplete clinical data. In IFMBE Proceedings (Vol. 41, pp. 1366–1369). Springer Verlag. https://doi.org/10.1007/978-3-319-00846-2_338