PGMI assessment in mammography: AI software versus human readers

1Citations
Citations of this article
7Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Introduction: The aim of this study was to evaluate human inter-reader agreement of parameters included in PGMI (perfect-good-moderate-inadequate) classification of screening mammograms and explore the role of artificial intelligence (AI) as an alternative reader. Methods: Five radiographers from three European countries independently performed a PGMI assessment of 520 anonymized mammography screening examinations randomly selected from representative subsets from 13 imaging centres within two European countries. As a sixth reader, a dedicated AI software was used. Accuracy, Cohen's Kappa, and confusion matrices were calculated to compare the predictions of the software against the individual assessment of the readers, as well as potential discrepancies between them. A questionnaire and a personality test were used to better understand the decision-making processes of the human readers. Results: Significant inter-reader variability among human readers with poor to moderate agreement (κ = −0.018 to κ = 0.41) was observed, with some showing more homogenous interpretations of single features and overall quality than others. In comparison, the software surpassed human inter-reader agreement in detecting glandular tissue cuts, mammilla deviation, pectoral muscle detection, and pectoral angle measurement, while remaining features and overall image quality exhibited comparable performance to human assessment. Conclusion: Notably, human inter-reader disagreement of PGMI assessment in mammography is considerably high. AI software may already reliably categorize quality. Its potential for standardization and immediate feedback to achieve and monitor high levels of quality in screening programs needs further attention and should be included in future approaches. Implications for practice: AI has promising potential for automated assessment of diagnostic image quality. Faster, more representative and more objective feedback may support radiographers in their quality management processes. Direct transformation of common PGMI workflows into an AI algorithm could be challenging.

Cite

CITATION STYLE

APA

Santner, T., Ruppert, C., Gianolini, S., Stalheim, J. G., Frei, S., Hondl, M., … Widmann, G. (2025). PGMI assessment in mammography: AI software versus human readers. Radiography, 31(5). https://doi.org/10.1016/j.radi.2025.103017

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free