Machine learning prediction of mortality in venous thromboembolism patients: the Birmingham Black Country Venous Thromboembolism (BBC-VTE) cohort

  • El-Bouri W
  • Sanders A
  • et al.

This article is free to access.

Abstract

Introduction: Venous thromboembolism (VTE), including deep vein thrombosis (DVT) and pulmonary embolism (PE), is one of the main causes of preventable death in hospitals in the UK. The current clinical risk scores for predicting mortality in patients with VTE are the pulmonary embolism severity index (PESI) and the simplified PESI (sPESI), which have similar predictive power.

Purpose: To evaluate the ability of machine learning algorithms to predict mortality in patients admitted with VTE and to compare their predictive capability with the sPESI score for 30-day mortality.

Methods: The BBC-VTE was a retrospective multicentre patient cohort established to determine clinical features and novel aspects of risk prediction for VTE (and VTE-related complications) in a contemporary cohort. We included a cohort of 1554 patients (mean age 65.6 years; 53% female) who represent all consecutive admissions with a final diagnosis of VTE to one of 3 regional hospitals in the West Midlands, UK during the years 2012-2014. The dataset was split into training (70%) and validation (30%) cohorts. We trained two tree-based models, Random Forests (RF) and XGBoost (XG), using 5-fold cross-validation on the training cohort to predict patient mortality. The models were validated on the held-out validation cohort and compared with a simple logistic regression model. To provide a comparison with the sPESI score, we extracted a sub-group of patients (n=652) who had values for oxygen saturation, systolic blood pressure, heart rate, history of cancer, history of cardiopulmonary disease, and age. We used RF to determine the mortality prediction using (i) only the sPESI variables listed and (ii) all the clinical variables available to us. These predictions were then compared against the standard sPESI prediction for this cohort. C-indices (AUC) were used for comparison.

Results: The c-indices for RF and XG using the full patient cohort were 0.85 [95% CI: 0.80 - 0.90] (Fig. 1a) and 0.82 [95% CI: 0.77 - 0.87], respectively, with the logistic regression c-index being 0.83 [95% CI: 0.78 - 0.88]. The reported sPESI c-index (0.75 [95% CI: 0.69 - 0.80]) was significantly smaller (p<0.05) than the RF c-index. The most important features for prediction of mortality indicated by the RF algorithm were age, admission blood levels, discharge oral anticoagulation, and previous malignancy (Fig. 2). The sPESI score c-index for the subgroup of patients was found to be 0.72. In comparison, using RF with the same variables gave a significantly larger (p<0.05) c-index of 0.78 [95% CI: 0.73 - 0.83]. When all available clinical variables were used, the c-index increased to 0.85 [95% CI: 0.80 - 0.90] (Fig. 1b).

Conclusion: Application of machine learning using simple clinical variables in hospital settings can improve prediction of mortality after a VTE event beyond the current simplified PESI risk score. A prospective study is warranted to validate the algorithm on external datasets and to construct individualised risk predictions.
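The abstract compares models by c-index with 95% confidence intervals but does not say how those intervals were obtained. The following is a minimal sketch of one common approach, assuming a percentile bootstrap; this is an assumption and not the authors' stated method. The function names `c_index` and `bootstrap_ci`, the resample count, and the toy data are all hypothetical and do not come from the BBC-VTE cohort.

```python
import random


def c_index(labels, scores):
    """C-index for a binary outcome (equivalent to the ROC AUC).

    It is the fraction of (event, non-event) pairs in which the event
    received the higher predicted risk; tied scores count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the c-index (assumed method)."""
    if len(set(labels)) < 2:
        raise ValueError("need at least one event and one non-event")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one outcome class
        stats.append(c_index(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example: 1 = died, 0 = survived; scores are predicted risks.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.20, 0.30, 0.40, 0.35, 0.60, 0.70, 0.90]
toy_auc = c_index(toy_labels, toy_scores)
toy_lower, toy_upper = bootstrap_ci(toy_labels, toy_scores)
print(f"c-index {toy_auc:.2f} [95% CI: {toy_lower:.2f} - {toy_upper:.2f}]")
```

The pairwise loop is quadratic in cohort size, which is acceptable for an illustration; with a cohort of the size described here a rank-based implementation from an established statistics library would be the more practical choice.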

Cite

El-Bouri, W., Sanders, A., & Lip, G. Y. H. (2021). Machine learning prediction of mortality in venous thromboembolism patients: the Birmingham Black Country Venous Thromboembolism (BBC-VTE) cohort. European Heart Journal, 42(Supplement_1). https://doi.org/10.1093/eurheartj/ehab724.3059
