Validation of an artificial intelligence model for 12-lead ECG interpretation

  • Demolder A
  • Herman R
  • Vavrik B
  • et al.

Abstract

The electrocardiogram (ECG) is one of the most accessible and comprehensive diagnostic tools to assess cardiac abnormalities. However, automated ECG interpretation remains inferior to physician interpretation in terms of accuracy and reliability.

This study evaluated the accuracy of an AI-powered ECG model in providing a precise diagnosis of 12-lead ECGs and compared its diagnostic performance with that of primary care physicians and cardiologists through extensive benchmarking.

A deep neural network (DNN) was trained on standard 12-lead ECGs to detect 38 diagnoses (grouped into 6 categories: rhythm, conduction abnormalities, chamber enlargement, infarction, ectopy, and axis), representing the most common types of electrocardiographic abnormalities. Performance of the AI-powered ECG diagnosis was evaluated on an independent test set annotated by consensus of two expert cardiologists. Benchmarking was performed against three individual primary care physicians and six individual cardiologists who independently annotated the same ECG test set. The key metrics used to compare performance were positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, and F1 score.

A total of 931,344 standard 12-lead ECGs from 172,750 patients were used to train the DNN. The independent test set had 11,932 annotated ECG labels. The model attained an overall mean F1 score of 0.921, sensitivity 0.910 (0.889–0.931), specificity 0.968 (0.954–0.981), PPV 0.939 (0.919–0.958), and NPV 0.965 (0.951–0.979) [Figure 1]. In all 6 diagnostic categories, the DNN achieved higher mean F1 scores than the mean cardiologist and the mean primary care physician (Rhythm 0.951 vs. 0.892 vs. 0.734; Conduction abnormalities 0.883 vs. 0.824 vs. 0.693; Chamber enlargement 0.970 vs. 0.761 vs. 0.562; Infarction 0.918 vs. 0.853 vs. 0.781; Ectopy 0.966 vs. 0.951 vs. 0.897; Axis 0.909 vs. 0.644 vs. 0.528, respectively). The DNN identified atrial fibrillation with near-perfect performance (PPV of 0.989 and NPV of 0.990). Based on the F1 scores for all individual diagnoses, diagnostic performance surpassed that of primary care physicians and was non-inferior to that of cardiologists.

Our results demonstrate the AI-powered ECG model’s ability to accurately identify electrocardiographic abnormalities from the 12-lead ECG, showcasing its utility as a clinical tool for healthcare professionals.
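
For readers less familiar with the five metrics reported above, the following is a minimal Python sketch of how they are conventionally derived from per-diagnosis confusion counts. It is not the authors' evaluation code; the names `ConfusionCounts` and `diagnostic_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for one diagnosis, model output vs. reference annotation."""
    tp: int  # abnormality present and detected
    fp: int  # abnormality absent but reported
    tn: int  # abnormality absent and not reported
    fn: int  # abnormality present but missed


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute sensitivity, specificity, PPV, NPV, and F1 from counts."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
        # F1 is the harmonic mean of PPV and sensitivity; this closed
        # form is algebraically equivalent to 2*PPV*Sens / (PPV + Sens).
        "f1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    example = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
    for name, value in diagnostic_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

An overall mean score such as the one reported in the abstract would then be an average of per-diagnosis values; the abstract does not state how that average was weighted, so the sketch stops at the per-diagnosis level.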

Cite

Demolder, A., Herman, R., Vavrik, B., Martonak, M., Boza, V., Herman, M., … Bartunek, J. (2023). Validation of an artificial intelligence model for 12-lead ECG interpretation. European Heart Journal, 44(Supplement_2). https://doi.org/10.1093/eurheartj/ehad655.2932