Replication of Real-World Evidence in Oncology Using Electronic Health Record Data Extracted by Machine Learning

Corey M. Benedum; Arjun Sondhi; Erin Fidyk; Aaron B. Cohen; Sheila Nemeth; Blythe Adamson; Melissa Estévez; Selen Bozkurt

Journal Article, Open Access

Cancers (2023) 15(6)

DOI: 10.3390/cancers15061853

Abstract

Meaningful real-world evidence (RWE) generation requires unstructured data found in electronic health records (EHRs), which are often missing from administrative claims; however, obtaining relevant data from unstructured EHR sources is resource-intensive. In response, researchers are using natural language processing (NLP) with machine learning (ML) techniques (i.e., ML extraction) to extract real-world data (RWD) at scale. This study assessed the quality and fitness-for-use of EHR-derived oncology data curated using NLP with ML, compared with the reference standard of expert abstraction. Using a sample of 186,313 patients with lung cancer from a nationwide, de-identified, EHR-derived database, we performed a series of replication analyses representative of common analyses in retrospective observational research that uses complex EHR-derived data. Eligible patients were selected into biomarker- and treatment-defined cohorts, first with expert-abstracted and then with ML-extracted data. We used the biomarker-defined cohorts to analyze biomarker-associated survival and the treatment-defined cohorts to analyze treatment comparative effectiveness. Across all analyses, the results differed by less than 8% between the data curation methods, and similar conclusions were reached. These results show that high-performance ML-extracted variables, produced by models trained on expert-abstracted data, can achieve results similar to those obtained with abstracted data, unlocking the ability to perform oncology research at scale.

Cite

Benedum, C. M., Sondhi, A., Fidyk, E., Cohen, A. B., Nemeth, S., Adamson, B., … Bozkurt, S. (2023). Replication of Real-World Evidence in Oncology Using Electronic Health Record Data Extracted by Machine Learning. Cancers, 15(6). https://doi.org/10.3390/cancers15061853
