Using electronic health records to accurately predict COVID-19 health outcomes through a novel machine learning pipeline


Abstract

Current COVID-19 predictive models primarily focus on predicting the risk of mortality and rely on COVID-19-specific medical data, such as chest imaging obtained after COVID-19 diagnosis. In this project, we developed an innovative supervised machine learning pipeline that uses longitudinal Electronic Health Records (EHR) to accurately predict COVID-19-related health outcomes, including mortality, ventilation, and days in hospital or ICU. In particular, we developed unique and effective data processing algorithms, including data cleaning, initial feature screening, vector representation, and feature normalization. We then trained models using state-of-the-art machine learning strategies combined with different parameter settings and feature selection. Based on routinely collected EHR, our machine learning pipeline not only consistently outperformed those developed by other research groups using the same data set, but also achieved mortality prediction accuracy similar to that of models trained on medical data available only after COVID-19 diagnosis. In addition, we identified top COVID-19 risk factors, which are consistent with epidemiologic findings.
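The abstract names four preprocessing stages (data cleaning, initial feature screening, vector representation, and feature normalization) without giving their details. The following is only a minimal illustrative sketch of how such stages could be chained on toy longitudinal records; it is not the author's algorithm. The event format `(patient_id, code)`, the function names, the `min_patients` threshold, and the choice of count vectors with min-max scaling are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def clean(events):
    """Drop events with a missing patient id or code; normalize code spelling."""
    return [
        (pid, code.strip().upper())
        for pid, code in events
        if pid is not None and code and code.strip()
    ]


def screen_features(events, min_patients=2):
    """Keep codes seen in at least `min_patients` distinct patients (assumed rule)."""
    patients_per_code = defaultdict(set)
    for pid, code in events:
        patients_per_code[code].add(pid)
    return sorted(
        code for code, pids in patients_per_code.items() if len(pids) >= min_patients
    )


def vectorize(events, features):
    """Represent each patient as a vector of per-code event counts."""
    counts = defaultdict(Counter)
    for pid, code in events:
        counts[pid][code] += 1
    return {pid: [counts[pid][f] for f in features] for pid in sorted(counts)}


def min_max_normalize(vectors):
    """Scale each feature column to [0, 1]; constant columns map to 0.0."""
    rows = list(vectors.values())
    if not rows:
        return {}
    lows = [min(col) for col in zip(*rows)]
    highs = [max(col) for col in zip(*rows)]
    return {
        pid: [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ]
        for pid, row in vectors.items()
    }


def preprocess(events, min_patients=2):
    """Run the four hypothetical stages in order and return (features, matrix)."""
    cleaned = clean(events)
    features = screen_features(cleaned, min_patients)
    return features, min_max_normalize(vectorize(cleaned, features))


if __name__ == "__main__":
    # Toy, made-up events; the codes are placeholders, not data from the paper.
    toy_events = [("p1", "I10"), ("p1", "E11"), ("p2", "I10"), ("p3", "E11")]
    print(preprocess(toy_events))
```

The normalized vectors produced this way could then be passed to any supervised learner; the abstract does not say which models or parameter settings were used, so none are assumed here.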

Cite

Feng, A. (2021). Using electronic health records to accurately predict COVID-19 health outcomes through a novel machine learning pipeline. In Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2021. Association for Computing Machinery, Inc. https://doi.org/10.1145/3459930.3469490
