Algorithmic transparency and interpretability measures improve radiologists’ performance in BI-RADS 4 classification

Friederike Jungmann; Sebastian Ziegelmayer; Fabian K. Lohoefer; Stephan Metz; Christina Müller-Leisse; Maximilian Englmaier; Marcus R. Makowski; Georgios A. Kaissis; Rickmer F. Braren

Journal Article (Open Access)

European Radiology (2023) 33(3) 1844-1851

DOI: 10.1007/s00330-022-09165-9

8 Citations

55 Readers

Abstract

Objective: To evaluate the perception of different types of AI-based assistance and the interaction of radiologists with the algorithm’s predictions and certainty measures.

Methods: In this retrospective observer study, four radiologists were asked to classify Breast Imaging-Reporting and Data System 4 (BI-RADS 4) lesions (n = 101 benign, n = 99 malignant). The effect of different types of AI-based assistance (occlusion-based interpretability map, classification, and certainty) on the radiologists’ performance (sensitivity, specificity, questionnaire) was measured. The influence of the Big Five personality traits was analyzed using the Pearson correlation.

Results: Diagnostic accuracy was significantly improved by AI-based assistance (an increase of 2.8% ± 2.3%, 95% CI 1.5% to 4.0%, p = 0.045), and trust in the algorithm was generated primarily by the certainty of the prediction (100% of participants). Different human-AI interactions were observed, ranging from nearly no interaction to humanization of the algorithm. High scores in neuroticism were correlated with higher persuasibility (Pearson’s r = 0.98, p = 0.02), while higher conscientiousness and change of accuracy showed an inverse correlation (Pearson’s r = −0.96, p = 0.04).

Conclusion: Trust in the algorithm’s performance was mostly dependent on the certainty of the predictions in combination with a plausible heatmap. Human-AI interaction varied widely and was influenced by personality traits.

Key Points:
• AI-based assistance significantly improved the diagnostic accuracy of radiologists in classifying BI-RADS 4 mammography lesions.
• Trust in the algorithm’s performance was mostly dependent on the certainty of the prediction in combination with a reasonable heatmap.
• Personality traits seem to influence human-AI collaboration. Radiologists with specific personality traits were more likely to change their classification according to the algorithm’s prediction than others.
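The reported correlation p-values can be cross-checked with a short sketch. The abstract does not say how the p-values were computed. The code below assumes one data point per radiologist (n = 4, so 2 degrees of freedom) and the standard two-sided t-test for Pearson’s r. The function name is hypothetical and is not taken from the paper.

```python
import math


def two_sided_p_for_r_n4(r: float) -> float:
    """Two-sided p-value for Pearson's r, ASSUMING n = 4 paired observations.

    Assumption (not stated in the abstract): one score per radiologist,
    tested with the usual statistic t = r * sqrt(n - 2) / sqrt(1 - r^2),
    which follows Student's t distribution with n - 2 = 2 degrees of freedom.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    if abs(r) == 1.0:
        return 0.0  # perfect correlation: t is infinite, p is 0

    t = r * math.sqrt(2.0) / math.sqrt(1.0 - r * r)

    # For 2 degrees of freedom the t CDF has a closed form:
    #   F(t) = 1/2 + t / (2 * sqrt(2 + t^2))
    # so the two-sided tail probability is 1 - |t| / sqrt(2 + t^2).
    return 1.0 - abs(t) / math.sqrt(2.0 + t * t)


if __name__ == "__main__":
    for r in (0.98, -0.96):
        print(f"r = {r:+.2f}  ->  p = {two_sided_p_for_r_n4(r):.2f}")
```

Under these assumptions the expression simplifies algebraically to p = 1 − |r|, which gives 0.02 for r = 0.98 and 0.04 for r = −0.96, matching the values in the abstract. With only four observations, even very high correlations sit close to the conventional 0.05 threshold.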

Cite (APA)

Jungmann, F., Ziegelmayer, S., Lohoefer, F. K., Metz, S., Müller-Leisse, C., Englmaier, M., … Braren, R. F. (2023). Algorithmic transparency and interpretability measures improve radiologists’ performance in BI-RADS 4 classification. European Radiology, 33(3), 1844–1851. https://doi.org/10.1007/s00330-022-09165-9
