Breast Cancer Prediction Based on Differential Privacy and Logistic Regression Optimization Model

Abstract

To improve the classification performance of the logistic regression (LR) model for breast cancer prediction, this paper first proposes a hybrid feature selection method that combines the Pearson correlation test with an iterative random forest algorithm based on out-of-bag estimation (RF-OOB) and selects the 17 best features as model inputs. Second, the LR model is trained with batch gradient descent (BGD-LR) to minimize its loss function. Third, to protect the privacy of breast cancer patients, differential privacy is added to the BGD-LR model, yielding an LR optimization model based on differential privacy with batch gradient descent (BDP-LR). Experiments are carried out on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, with accuracy, precision, recall, and F1-score as the four main evaluation indicators; the hyperparameters of each model are determined by grid search with cross-validation. After hybrid feature selection, the best values of the four indicators for the BGD-LR model are 0.9912, 1, 0.9886, and 0.9943, with accuracy, recall, and F1-score increased by 2.63%, 3.41%, and 1.76%, respectively. For the BDP-LR model, a privacy budget of ε = 0.8 gives an effective balance between classification performance and privacy protection; at this setting the four indicators are 0.9721, 0.9975, 0.9664, and 0.9816, improvements of 1.58%, 0.26%, 1.81%, and 1.07%, respectively. Comparative analysis shows that the BGD-LR and BDP-LR models constructed in this paper perform better than other classification models.
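
The training and evaluation steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how noise is injected, so the sketch assumes gradient perturbation with the Laplace mechanism (per-example L1 clipping, privacy budget split evenly across iterations). The function names, the learning rate, the clipping bound, and the synthetic data standing in for WDBC are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    # Clip the logits so np.exp cannot overflow when noisy weights grow large.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500.0, 500.0)))


def train_bgd_lr(X, y, lr=0.5, n_iters=300, epsilon=None, clip=1.0, seed=0):
    """Batch gradient descent for logistic regression.

    If epsilon is given, Laplace noise is added to the averaged gradient.
    ASSUMPTION: this noise mechanism is illustrative; the paper's BDP-LR
    mechanism is not specified in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])  # append a bias column
    w = np.zeros(d + 1)
    for _ in range(n_iters):
        p = sigmoid(Xb @ w)
        per_example = (p - y)[:, None] * Xb  # log-loss gradient per sample
        if epsilon is not None:
            # Bound each sample's influence: L1 norm of its gradient <= clip.
            norms = np.abs(per_example).sum(axis=1, keepdims=True)
            per_example = per_example / np.maximum(1.0, norms / clip)
        grad = per_example.mean(axis=0)  # full-batch average
        if epsilon is not None:
            eps_step = epsilon / n_iters       # even split of the budget
            sensitivity = 2.0 * clip / n       # replace-one L1 sensitivity
            grad = grad + rng.laplace(0.0, sensitivity / eps_step, grad.shape)
        w -= lr * grad
    return w


def predict(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return (sigmoid(Xb @ w) >= 0.5).astype(int)


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score for binary labels."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Usage on synthetic data (a stand-in for WDBC, not the real dataset).
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.normal(size=(400, 3))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)

w_plain = train_bgd_lr(X_demo, y_demo)              # no noise
w_private = train_bgd_lr(X_demo, y_demo, epsilon=0.8)  # noisy gradients
print("no noise:", classification_metrics(y_demo, predict(w_plain, X_demo)))
print("eps=0.8 :", classification_metrics(y_demo, predict(w_private, X_demo)))
```

A smaller ε means a larger noise scale and usually lower scores, which is the trade-off between classification performance and privacy protection that the abstract reports for ε = 0.8. The sketch evaluates on its training data for brevity; the paper instead uses grid search with cross-validation.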

Cite

Chen, H., Wang, N., Zhou, Y., Mei, K., Tang, M., & Cai, G. (2023). Breast Cancer Prediction Based on Differential Privacy and Logistic Regression Optimization Model. Applied Sciences (Switzerland), 13(19). https://doi.org/10.3390/app131910755
