A Comparison of Machine Learning Algorithms to Predict Cervical Cancer on Imbalanced Data


Abstract

Cervical cancer is a leading cause of death in women. This study compares machine learning techniques for predicting cervical cancer and identifies the best-performing method. The data come from the University Hospital of Caracas, Venezuela; the predictor variables were selected according to the literature. Seven algorithms were applied: decision tree (DT), random forest (RF), logistic regression (LR), XGBoost (XG), Naive Bayes (NB), multilayer perceptron (MLP) and K-nearest neighbors (KNN). Three techniques for handling imbalanced data were also applied: SMOTETomek, SMOTE, and ROS, with Hinselmann, Schiller, Cytology and Biopsy as target variables. Results were evaluated using accuracy, precision, recall, F-score and AUC. Random forest achieved the highest accuracy, precision and F-score, with 94.57%, 72.46% and 60.70% respectively. Logistic regression and Naive Bayes had the highest recall and AUC, with 68.37% and 79.11% respectively.
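
The gap between the reported accuracy and F-score reflects a general property of imbalanced data: a model can be right most of the time while still missing the rare positive class. The sketch below illustrates this, along with a simple form of random oversampling, which is what ROS is commonly taken to mean. It is not the authors' code. The toy labels, the function names random_oversample and binary_metrics, and the "always negative" baseline are all assumptions made for illustration, and a real study would more likely rely on an established library than on hand-written helpers.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 from the confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented toy data: 2 positive cases among 20 records (not from the paper).
X = [[i] for i in range(20)]
y = [1] * 2 + [0] * 18

# Oversampling brings both classes to the majority count.
X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)

# A baseline that never predicts the positive class still looks accurate.
always_negative = [0] * len(y)
baseline = binary_metrics(y, always_negative)
print(balanced_counts, baseline)
```

In practice, oversampling of this kind is normally applied to the training split only, so that duplicated records do not leak into the evaluation data.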


Ortiz-Torres, C., Reátegui, R., Valdiviezo-Diaz, P., & Barba-Guaman, L. (2023). A Comparison of Machine Learning Algorithms to Predict Cervical Cancer on Imbalanced Data. In Communications in Computer and Information Science (Vol. 1755 CCIS, pp. 118–129). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-24985-3_9
