Abstract
IMPORTANCE Large language models (LLMs) have potential to increase the efficiency of information extraction from unstructured clinical notes in electronic medical records. OBJECTIVE To assess the utility and reliability of an LLM, ChatGPT-4 (OpenAI), to analyze clinical narratives and identify helmet use status of patients injured in micromobility-related accidents. DESIGN, SETTING, AND PARTICIPANTS This cross-sectional study used publicly available, deidentified 2019 to 2022 data from the US Consumer Product Safety Commission’s National Electronic Injury Surveillance System, a nationally representative stratified probability sample of 96 hospitals in the US. Unweighted estimates of e-bike, bicycle, hoverboard, and powered scooter-related injuries that resulted in an emergency department visit were used. Statistical analysis was performed from November 2023 to April 2024. MAIN OUTCOMES AND MEASURES Patient helmet status (wearing vs not wearing vs unknown) was extracted from clinical narratives using (1) a text string search using researcher-generated text strings and (2) the LLM by prompting the system with low-, intermediate-, and high-detail prompts. The level of agreement between the 2 approaches across all 3 prompts was analyzed using Cohen κ test statistics. Fleiss κ was calculated to measure the test-retest reliability of the high-detail prompt across 5 new chat sessions and days. Performance statistics were calculated by comparing results from the high-detail prompt to classifications of helmet status generated by researchers reading the clinical notes (ie, a criterion standard review). RESULTS Among 54 569 clinical notes, moderate (Cohen κ = 0.74 [95% CI, 0.73-0.75) and weak (Cohen κ = 0.53 [95% CI, 0.52-0.54]) agreement were found between the text string-search approach and the LLM for the low- and intermediate-detail prompts, respectively. The high-detail prompt had almost perfect agreement (κ = 1.00 [95% CI, 1.00-1.00]) but required the greatest amount of time to complete. The LLM did not perfectly replicate its analyses across new sessions and days (Fleiss κ = 0.91 across 5 trials; P
Cite
CITATION STYLE
Burford, K. G., Itzkowitz, N. G., Ortega, A. G., Teitler, J. O., & Rundle, A. G. (2024). Use of Generative AI to Identify Helmet Status Among Patients With Micromobility-Related Injuries From Unstructured Clinical Notes. JAMA Network Open, 7(8). https://doi.org/10.1001/jamanetworkopen.2024.25981
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.