Applying probability calibration to ensemble methods to predict 2-year mortality in patients with DLBCL

Shuanglong Fan; Zhiqiang Zhao; Hongmei Yu; Lei Wang; Chuchu Zheng; Xueqian Huang; Zhenhuan Yang; Meng Xing; Qing Lu; Yanhong Luo

Journal Article, Open Access

BMC Medical Informatics and Decision Making (2021) 21(1)

DOI: 10.1186/s12911-020-01354-0

Abstract

Background: The survival rates of patients with diffuse large B-cell lymphoma (DLBCL) differ with chemotherapy regimen, clinical stage, immunologic expression and other factors. Accurate prediction of mortality hazard is key to precision medicine: it can help clinicians make optimal therapeutic decisions and extend the survival time of individual patients with DLBCL. We therefore developed a model to predict the mortality hazard of DLBCL patients within 2 years of treatment.

Methods: We evaluated 406 patients with DLBCL and collected 17 variables from each patient. Predictive variables were selected using the Cox model, the logistic model and the random forest algorithm. Five classifiers were chosen as base models for ensemble learning: naïve Bayes, logistic regression, random forest, support vector machine and feedforward neural network. We first calibrated the biased outputs of the five base models using probability calibration methods (shape-restricted polynomial regression, Platt scaling and isotonic regression). We then aggregated the outputs of the base models to predict the 2-year mortality of DLBCL patients using three strategies (stacking, simple averaging and weighted averaging). Finally, we assessed model performance over 300 hold-out tests.

Results: Gender, stage, IPI, KPS and rituximab were significant factors for predicting the death of DLBCL patients within 2 years of treatment. Among all methods, the stacking model whose base models were first calibrated by shape-restricted polynomial regression performed best (AUC = 0.820, ECE = 8.983, MCE = 21.265). By comparison, the stacking model without probability calibration was inferior (AUC = 0.806, ECE = 9.866, MCE = 24.850). For the simple averaging and weighted averaging models, the prediction error of the ensemble also decreased with probability calibration.

Conclusions: Among all the methods compared, the proposed model had the lowest prediction error when predicting the 2-year mortality of DLBCL patients. These promising results suggest that the strategy of applying probability calibration to ensemble learning is effective.
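
The calibrate-then-aggregate idea described in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: it shows only isotonic regression (via the standard pool-adjacent-violators procedure) and simple or weighted averaging, and leaves out Platt scaling, shape-restricted polynomial regression and stacking. All function names are hypothetical. The equal-width binning used for ECE and MCE, and the choice to report them as percentages, are assumptions rather than details confirmed by the abstract.

```python
from typing import List, Sequence, Tuple


def fit_isotonic(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Fit a non-decreasing step function with pool-adjacent-violators.

    Returns (upper score bound, calibrated probability) pairs, one per block.
    """
    blocks: List[list] = []  # each block: [label sum, count, highest score]
    for s, y in sorted(zip(scores, labels)):
        if blocks and blocks[-1][2] == s:
            # Tied score: pool it with the block that already holds this score.
            blocks[-1][0] += y
            blocks[-1][1] += 1
        else:
            blocks.append([float(y), 1, s])
        # Merge backwards while the previous block mean exceeds the last one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            y_sum, count, high = blocks.pop()
            blocks[-1][0] += y_sum
            blocks[-1][1] += count
            blocks[-1][2] = high
    return [(high, y_sum / count) for y_sum, count, high in blocks]


def apply_isotonic(model: List[Tuple[float, float]], score: float) -> float:
    """Map a raw score to its calibrated probability (step-function lookup)."""
    for high, value in model:
        if score <= high:
            return value
    return model[-1][1]  # scores above the training range take the last value


def simple_average(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-patient probabilities of several base models."""
    return [sum(col) / len(col) for col in zip(*model_probs)]


def weighted_average(model_probs: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Weighted mean per patient; weights are normalised to sum to 1."""
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, col)) / total
            for col in zip(*model_probs)]


def calibration_errors(y_true: Sequence[int], y_prob: Sequence[float],
                       n_bins: int = 10) -> Tuple[float, float]:
    """Return (ECE, MCE) in percent using equal-width probability bins."""
    bins = [[0, 0.0, 0.0] for _ in range(n_bins)]  # count, prob sum, outcome sum
    for y, p in zip(y_true, y_prob):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[b][0] += 1
        bins[b][1] += p
        bins[b][2] += y
    n = len(y_true)
    ece, mce = 0.0, 0.0
    for count, p_sum, y_sum in bins:
        if count == 0:
            continue
        gap = abs(y_sum / count - p_sum / count)  # |observed rate - mean prediction|
        ece += (count / n) * gap  # gap weighted by the share of patients in the bin
        mce = max(mce, gap)       # worst bin
    return 100 * ece, 100 * mce
```

In such a pipeline, each base model's raw scores would be passed through its own fitted calibrator on held-out data before `simple_average` or `weighted_average` combines them, and `calibration_errors` would then be computed on the combined probabilities.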

Fan, S., Zhao, Z., Yu, H., Wang, L., Zheng, C., Huang, X., … Luo, Y. (2021). Applying probability calibration to ensemble methods to predict 2-year mortality in patients with DLBCL. BMC Medical Informatics and Decision Making, 21(1). https://doi.org/10.1186/s12911-020-01354-0
