Extracting forced vital capacity from the electronic health record through natural language processing in rheumatoid arthritis-associated interstitial lung disease

Abstract

Purpose: To develop a natural language processing (NLP) tool to extract forced vital capacity (FVC) values from electronic health record (EHR) notes in patients with rheumatoid arthritis-interstitial lung disease (RA-ILD).

Methods: We selected RA-ILD patients (n = 7485) in the Veterans Health Administration (VA) between 2000 and 2020 using validated ICD-9/10 codes. We identified numeric values in proximity to FVC string patterns from clinical notes in the EHR. Subsequently, we performed processing steps to account for variability in note structure, related pulmonary function test (PFT) output, and values copied across notes, then assigned dates from linked administrative procedure records. NLP-derived FVC values were compared to values recorded directly from PFT equipment available on a subset of patients.

Results: We identified 5911 FVC values (n = 1844 patients) from PFT equipment and 15 383 values (n = 4982 patients) by NLP. Among 2610 date-matched FVC values from NLP and PFT equipment, 95.8% of values were within 5% predicted. The mean (SD) difference was 0.09% (5.9), and values strongly correlated (r = 0.94, p < 0.001), with a precision of 0.87 (95% CI 0.86, 0.88). NLP captured more patients with longitudinal FVC values (n = 3069 vs. n = 1164). Mean (SD) change in FVC %-predicted per year was similar between sources (−1.5 [30.0] NLP vs. −0.9 [16.6] PFT equipment; standardized response mean = 0.05 for both).

Conclusions: NLP of EHR notes increases the capture of accurate, longitudinal FVC values by three-fold over PFT equipment. Use of this NLP tool can facilitate pharmacoepidemiologic research in RA-ILD and other lung diseases by capturing this critical measure of disease severity.
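The Methods describe finding numeric values in proximity to FVC string patterns and then comparing NLP-derived values with PFT equipment values. The Python sketch below illustrates that general idea only; it is not the authors' tool. The regular expression, the 20-character search window, the exclusion of the FEV1/FVC ratio, and the names `FVC_PATTERN`, `extract_fvc`, and `fraction_within` are all assumptions made for illustration, and the sample note is invented.

```python
import re

# Hypothetical pattern (not from the paper): an "FVC" label followed within a
# short window by a number and an optional unit. The lookbehind skips the
# FEV1/FVC ratio, which is a different measure.
FVC_PATTERN = re.compile(
    r"(?<!/)\bFVC\b"           # FVC label, not preceded by "/"
    r"[^\d\n]{0,20}?"          # up to 20 non-digit characters (":", "was", spaces)
    r"(\d+(?:\.\d+)?)"         # the numeric value
    r"\s*(%|L\b|liters?\b)?",  # optional unit
    re.IGNORECASE,
)


def extract_fvc(note: str) -> list[tuple[float, str]]:
    """Return (value, unit) pairs for each FVC mention found in a note."""
    results = []
    for match in FVC_PATTERN.finditer(note):
        value = float(match.group(1))
        unit = match.group(2) or ""  # empty string when no unit was written
        results.append((value, unit))
    return results


def fraction_within(nlp_values, pft_values, tolerance=5.0):
    """Share of paired values whose absolute difference is <= tolerance."""
    pairs = list(zip(nlp_values, pft_values))
    if not pairs:
        raise ValueError("no paired values to compare")
    close = sum(1 for nlp, pft in pairs if abs(nlp - pft) <= tolerance)
    return close / len(pairs)


# Invented example note, for illustration only.
sample_note = "PFTs today: FVC: 68% predicted, FEV1/FVC 0.75. Prior FVC was 2.31 L."
extracted = extract_fvc(sample_note)
```

A pattern this simple will still pick up wrong numbers in some notes (for example, a nearby FEV1 value when the FVC itself is not reported), which is presumably why the abstract mentions further processing for note structure, related PFT output, and copied values. Those steps are not specified in the abstract and are not attempted here.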

England, B. R., Roul, P., Yang, Y., Hershberger, D., Sayles, H., Rojas, J., … Mikuls, T. R. (2024). Extracting forced vital capacity from the electronic health record through natural language processing in rheumatoid arthritis-associated interstitial lung disease. Pharmacoepidemiology and Drug Safety, 33(1). https://doi.org/10.1002/pds.5744
