Objectives: To evaluate whether artificial intelligence (AI) can discriminate recalled benign from recalled malignant mammographic screening abnormalities to improve screening performance. Methods: This retrospective study included 2257 full-field digital mammography screening examinations, obtained 2011–2013, of women aged 50–69 years who were recalled for further assessment after independent double-reading with arbitration. The recalls covered 295 of 305 truly malignant lesions and 2289 benign lesions. A deep learning AI system was used to obtain a score (0–95) for each recalled lesion, representing the likelihood of breast cancer. The lesion-level sensitivity and the proportion of women without false-positive ratings (non-FPR) resulting under AI were estimated as a function of the classification cutoff and compared to those of human readers. Results: Using a cutoff of 1, AI decreased the proportion of women with false-positives from 89.9% to 62.0%, non-FPR 11.1% vs. 38.0% (difference 26.9%, 95% confidence interval 25.1–28.8%; p
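The two cutoff-dependent measures named in the Methods can be illustrated with a minimal sketch. It is not the study's code. The decision rule (a lesion counts as AI-positive when its score is at or above the cutoff), the choice of denominators, and the names `Lesion`, `sensitivity`, and `non_fpr` are all assumptions made for illustration, and the toy records are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lesion:
    """One assessed lesion (hypothetical record layout)."""
    woman_id: int
    score: int        # AI score; the abstract reports a 0-95 range
    malignant: bool   # ground truth after further assessment


def sensitivity(lesions, cutoff):
    """Share of malignant lesions with score >= cutoff (assumed rule)."""
    malignant = [l for l in lesions if l.malignant]
    if not malignant:
        raise ValueError("no malignant lesions in the sample")
    detected = sum(1 for l in malignant if l.score >= cutoff)
    return detected / len(malignant)


def non_fpr(lesions, cutoff):
    """Share of women with no benign lesion scored >= cutoff.

    Assumption: the denominator is every woman present in the records.
    """
    women = {l.woman_id for l in lesions}
    if not women:
        raise ValueError("no lesions in the sample")
    with_false_positive = {
        l.woman_id for l in lesions
        if not l.malignant and l.score >= cutoff
    }
    return (len(women) - len(with_false_positive)) / len(women)


# Invented toy data: five women, six recalled lesions.
toy = [
    Lesion(woman_id=1, score=0, malignant=False),
    Lesion(woman_id=2, score=12, malignant=False),
    Lesion(woman_id=3, score=80, malignant=True),
    Lesion(woman_id=4, score=0, malignant=True),
    Lesion(woman_id=4, score=40, malignant=False),
    Lesion(woman_id=5, score=0, malignant=False),
]

for cutoff in (0, 1, 50):
    print(cutoff, sensitivity(toy, cutoff), non_fpr(toy, cutoff))
```

With a cutoff of 0 every recalled lesion stays positive, which mimics accepting all reader recalls. Raising the cutoff removes false-positive ratings at the cost of missed malignant lesions, the trade-off the Results quantify. To match a denominator that includes malignant lesions the readers did not recall, those lesions would have to be present in the records.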
Kerschke, L., Weigel, S., Rodriguez-Ruiz, A., Karssemeijer, N., & Heindel, W. (2022). Using deep learning to assist readers during the arbitration process: a lesion-based retrospective evaluation of breast cancer screening performance. European Radiology, 32(2), 842–852. https://doi.org/10.1007/s00330-021-08217-w