Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening

Abstract

As patient health information is highly regulated due to privacy concerns, most machine learning (ML)-based healthcare studies are unable to test on external patient cohorts, resulting in a gap between locally reported model performance and cross-site generalizability. Different approaches have been introduced for developing models across multiple clinical sites; however, less attention has been given to adopting ready-made models in new settings. We introduce three methods to do this: (1) applying a ready-made model “as-is”; (2) readjusting the decision threshold on the model’s output using site-specific data; and (3) finetuning the model using site-specific data via transfer learning. Using a case study of COVID-19 diagnosis across four NHS Hospital Trusts, we show that all methods achieve clinically effective performance (NPV > 0.959), with transfer learning achieving the best results (mean AUROCs between 0.870 and 0.925). Our models demonstrate that site-specific customization improves predictive performance when compared to other ready-made approaches.
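To make the second method more concrete, the sketch below shows one way a decision threshold could be readjusted on site-specific data. It is an illustration only: the abstract does not say how the authors selected thresholds, so the sensitivity-targeting rule, the function names `pick_threshold` and `npv`, and the `target_sensitivity` parameter are all assumptions made for this example.

```python
import numpy as np

def pick_threshold(scores, labels, target_sensitivity=0.9):
    """Hypothetical helper: highest threshold whose sensitivity on the
    site-specific data is at least target_sensitivity.
    A case is called positive when score >= threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos_scores = np.sort(scores[labels == 1])[::-1]  # positives, descending
    if pos_scores.size == 0:
        raise ValueError("need at least one positive case")
    # Number of positive cases that must be flagged to reach the target.
    k = max(int(np.ceil(target_sensitivity * pos_scores.size)), 1)
    return float(pos_scores[k - 1])

def npv(scores, labels, threshold):
    """Negative predictive value: TN / (TN + FN) at the given threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    predicted_negative = scores < threshold
    tn = np.sum(predicted_negative & (labels == 0))
    fn = np.sum(predicted_negative & (labels == 1))
    return float(tn / (tn + fn)) if (tn + fn) else float("nan")

# Toy site-specific calibration set (made-up numbers, not from the study).
site_scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
site_labels = [1, 1, 1, 1, 0, 0, 0, 0]
threshold = pick_threshold(site_scores, site_labels, target_sensitivity=0.75)
site_npv = npv(site_scores, site_labels, threshold)
```

The ready-made model itself is left untouched here; only the cut-off applied to its output scores changes, which is what distinguishes this approach from finetuning the model's weights.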

Cite

Yang, J., Soltan, A. A. S., & Clifton, D. A. (2022). Machine learning generalizability across healthcare settings: insights from multi-site COVID-19 screening. Npj Digital Medicine, 5(1). https://doi.org/10.1038/s41746-022-00614-9
