Assessment of Factors Influencing the Survival of Breast Cancer Patients using a Machine Learning Approach

  • Motarwar S
  • et al.

Abstract

Breast cancer is one of the deadliest diseases, claiming approximately 627,000 lives worldwide in 2018–2019. Automated prediction that supports early detection can therefore help clinicians treat the disease at an early stage and substantially reduce the risk of death. The present study uses the Breast Cancer Wisconsin (Diagnostic) Data Set from the University of California Irvine (UCI) Machine Learning Repository. The dataset (n=699) contains 30 predictor variables and one dependent variable, which records the type of cancer tissue: benign or malignant. To predict the tissue type, models were built using six algorithms: 1) Logistic Regression (LR), 2) Decision Tree Classifier (DTC), 3) Random Forest Classifier (RFC), 4) K Nearest Neighbor (KNN), 5) Support Vector Machine (SVM), and 6) Ada Boost Classifier (ABC). To improve accuracy, a correlation matrix was used to select the top 8 features. To improve accuracy further, the Synthetic Minority Oversampling Technique (SMOTE) was applied to address class imbalance, and accuracy was compared before and after SMOTE. Precision, Recall, and F1 score were the performance metrics used to select the best model. After SMOTE was applied to each of the six algorithms, KNN achieved the highest accuracy, 95.321%. The results also show that while SMOTE improves the accuracy of some algorithms, it adversely affects the performance of others. This research could be developed into practical tools for the medical field that predict the stage of disease more accurately and so support better treatment management.
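The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbors. This is not the authors' code. The function name `smote`, its parameters, and the toy two-feature data are assumptions made for illustration only, and a real analysis would more likely rely on an established library implementation.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy two-feature data, made up for illustration (not from the dataset).
benign = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [1.3, 1.1], [0.8, 0.9], [1.2, 1.0]]
malignant = [[3.0, 3.2], [3.4, 2.9], [2.8, 3.1]]

# Oversample the minority class until both classes have the same size.
new_points = smote(malignant, n_new=len(benign) - len(malignant), k=2)
balanced_malignant = malignant + new_points
print(len(benign), len(balanced_malignant))
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class. In practice, oversampling is usually applied to the training split only, so that synthetic samples do not leak into the evaluation data.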

Cite

Motarwar, S., & Jha, D. K. (2022). Assessment of Factors Influencing the Survival of Breast Cancer Patients using a Machine Learning Approach. International Journal of Innovative Technology and Exploring Engineering, 11(3), 80–84. https://doi.org/10.35940/ijitee.c9713.0111322
