Weakly supervised segmentation models as explainable radiological classifiers for lung tumour detection on CT images

This article is free to access.

Abstract

Purpose: Interpretability is essential for reliable convolutional neural network (CNN) image classifiers in radiological applications. We describe a weakly supervised segmentation model that learns to delineate the target object while being trained with only image-level labels (“image contains object” or “image does not contain object”), offering a different approach to explainable object detectors for radiological imaging tasks.
Methods: A weakly supervised Unet architecture (WSUnet) was trained to learn lung tumour segmentation from image-level labelled data. WSUnet generates voxel probability maps with a Unet and then constructs an image-level prediction by global max-pooling, so that it can be trained with image-level labels alone. WSUnet’s voxel-level predictions were compared to traditional model interpretation techniques (class activation mapping, integrated gradients and occlusion sensitivity) in CT data from three institutions (training/validation: n = 412; testing: n = 142). Methods were compared using voxel-level discrimination metrics, and clinical value was assessed with a clinician preference survey on data from external institutions.
Results: Despite the absence of voxel-level labels in training, WSUnet’s voxel-level predictions localised tumours precisely in both validation (precision: 0.77, 95% CI: [0.76–0.80]; Dice: 0.43, 95% CI: [0.39–0.46]) and external testing (precision: 0.78, 95% CI: [0.76–0.81]; Dice: 0.33, 95% CI: [0.32–0.35]). WSUnet’s voxel-level discrimination outperformed the best comparator in validation (area under the precision-recall curve (AUPR): 0.55, 95% CI: [0.49–0.56] vs. 0.23, 95% CI: [0.21–0.25]) and testing (AUPR: 0.40, 95% CI: [0.38–0.41] vs. 0.36, 95% CI: [0.34–0.37]). Clinicians preferred WSUnet predictions in most instances (clinician preference rate: 0.72, 95% CI: [0.68–0.77]).
Conclusion: Weakly supervised segmentation is a viable approach by which explainable object detection models may be developed for medical imaging.
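The voxel-level precision and Dice figures reported above can be illustrated with a small sketch. This is only an assumption about how such metrics are commonly computed from binary masks; the function names `voxel_precision` and `dice_coefficient` and the toy masks are hypothetical and are not taken from the article.

```python
import numpy as np

def voxel_precision(pred_mask, true_mask):
    """Fraction of predicted-positive voxels that lie inside the true mask."""
    true_positive = np.logical_and(pred_mask, true_mask).sum()
    predicted = pred_mask.sum()
    return float(true_positive / predicted) if predicted > 0 else 0.0

def dice_coefficient(pred_mask, true_mask):
    """Dice overlap: 2 * |A and B| / (|A| + |B|)."""
    true_positive = np.logical_and(pred_mask, true_mask).sum()
    total = pred_mask.sum() + true_mask.sum()
    return float(2 * true_positive / total) if total > 0 else 1.0

# Toy example (hypothetical data): the true tumour covers 4 voxels,
# the prediction covers 2 voxels, both inside the tumour.
true_mask = np.zeros((4, 4), dtype=bool)
true_mask[1:3, 1:3] = True
pred_mask = np.zeros((4, 4), dtype=bool)
pred_mask[1, 1:3] = True

precision = voxel_precision(pred_mask, true_mask)
dice = dice_coefficient(pred_mask, true_mask)
print(f"precision={precision:.2f}, dice={dice:.2f}")
```

In this toy case every predicted voxel is correct, yet the Dice score is lower because part of the object is missed. This shows that the two metrics can diverge; it is not a claim about why they differ in the study.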
Critical relevance statement: WSUnet learns to segment images at voxel level, training only with image-level labels. A Unet backbone first generates a voxel-level probability map and then extracts the maximum voxel prediction as the image-level prediction. Thus, training uses only image-level annotations, reducing human workload. WSUnet’s voxel-level predictions provide a causally verifiable explanation for its image-level prediction, improving interpretability.
Key points:
• Explainability and interpretability are essential for reliable medical image classifiers.
• This study applies weakly supervised segmentation to generate explainable image classifiers.
• The weakly supervised Unet inherently explains its image-level predictions at voxel level.
Graphical abstract: [Figure not available: see fulltext.]
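The max-pooling step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a random array stands in for the Unet output, binary cross-entropy is assumed as the image-level loss, and the names `image_level_prediction`, `explaining_voxel` and `binary_cross_entropy` are hypothetical.

```python
import numpy as np

def image_level_prediction(voxel_probs):
    """Global max-pooling: the image-level probability is the largest voxel probability."""
    return float(np.max(voxel_probs))

def explaining_voxel(voxel_probs):
    """Index of the single voxel that determines the image-level prediction."""
    return np.unravel_index(np.argmax(voxel_probs), voxel_probs.shape)

def binary_cross_entropy(prob, label, eps=1e-7):
    """Image-level loss (assumed here); needs only an image-level label in {0, 1}."""
    prob = min(max(prob, eps), 1.0 - eps)
    return float(-(label * np.log(prob) + (1 - label) * np.log(1.0 - prob)))

# Hypothetical 4x4x4 probability map standing in for a Unet output.
rng = np.random.default_rng(0)
voxel_probs = rng.uniform(0.0, 0.2, size=(4, 4, 4))
voxel_probs[1, 2, 3] = 0.9  # one confident "tumour" voxel

p_image = image_level_prediction(voxel_probs)
loss = binary_cross_entropy(p_image, label=1)
print(p_image, explaining_voxel(voxel_probs), loss)
```

Because the image-level output is exactly one voxel's probability, that voxel can be inspected directly as the explanation for the classification. No voxel-level label appears anywhere in the loss.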

Cite

O’Shea, R., Manickavasagar, T., Horst, C., Hughes, D., Cusack, J., Tsoka, S., … Goh, V. (2023). Weakly supervised segmentation models as explainable radiological classifiers for lung tumour detection on CT images. Insights into Imaging, 14(1). https://doi.org/10.1186/s13244-023-01542-2
