Evaluating the quality of visual explanations on chest X-ray images for thorax diseases classification

3Citations
Citations of this article
43Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Deep learning models are extensively used but often lack transparency due to their complex internal mechanics. To bridge this gap, the field of explainable AI (XAI) strives to make these models more interpretable. However, a significant obstacle in XAI is the absence of quantifiable metrics for evaluating explanation quality. Existing techniques, reliant on manual assessment or inadequate metrics, face limitations in scalability, reproducibility, and trustworthiness. Recognizing these issues, the current study specifically addresses the quality assessment of visual explanations in medical imaging, where interpretability profoundly influences diagnostic accuracy and trust in AI-assisted decisions. Introducing novel criteria such as informativeness, localization, coverage, multi-target capturing, and proportionality, this work presents a comprehensive method for the objective assessment of various explainability algorithms. These newly introduced criteria aid in identifying optimal evaluation metrics. The study expands the domain’s analytical toolkit by examining existing metrics, which have been prevalent in recent works for similar applications, and proposing new ones. Rigorous analysis led to selecting Jensen–Shannon divergence (JS_DIV) as the most effective metric for visual explanation quality. Applied to the multi-label, multi-class diagnosis of thoracic diseases using a trained classifier on the CheXpert dataset, local interpretable model-agnostic explanations (LIME) with diverse segmentation strategies interpret the classifier’s decisions. A qualitative analysis on an unseen subset of the VinDr-CXR dataset evaluates these metrics, confirming JS_DIV’s superiority. The subsequent quantitative analysis optimizes LIME’s hyper-parameters and benchmarks its performance across various segmentation algorithms, underscoring the utility of an objective assessment metric in practical applications.

Cite

CITATION STYLE

APA

Rahimiaghdam, S., & Alemdar, H. (2024). Evaluating the quality of visual explanations on chest X-ray images for thorax diseases classification. Neural Computing and Applications, 36(17), 10239–10255. https://doi.org/10.1007/s00521-024-09587-0

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free