Automatic and accurate retinal disease diagnosis is critical to guide proper therapy and prevent potential vision loss. Previous works simply exploit the most discriminative features while ignoring the pathological visual clues provided by scattered, subtle lesions. Consequently, without a comprehensive understanding of features from different lesion regions, they are vulnerable to noise from complex backgrounds and suffer from misclassification failures. In this paper, we address these limitations with a novel saliency-guided abnormality-aware transformer that explicitly captures the correlations between different lesion features from a global perspective with enhanced pathological semantics. The model has several merits. First, we propose a saliency enhancement module (SEM) that adaptively integrates disease-related semantics and highlights potentially salient lesion regions. Second, to the best of our knowledge, this is the first work to explore comprehensive lesion feature dependencies via a tailored efficient self-attention mechanism. Third, combining the saliency enhancement module and abnormality-aware attention, we propose a new Vision Transformer variant, called SatFormer, which outperforms state-of-the-art methods on two public retinal disease classification benchmarks. Ablation studies show that the proposed components can be embedded into any Vision Transformer in a plug-and-play manner and effectively boost performance.
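To make the plug-and-play idea concrete, below is a minimal sketch of how a saliency-weighted token enhancement and an abnormality-aware (reduced key/value) attention block could be dropped into a ViT-style token pipeline. All module names, shapes, and design details here are assumptions for illustration only; this is not the authors' released SatFormer implementation.

```python
import torch
import torch.nn as nn


class SaliencyEnhancementModule(nn.Module):
    """Re-weights patch tokens with a learned per-token saliency score (assumed design)."""

    def __init__(self, dim: int):
        super().__init__()
        # One saliency score per token, squashed to (0, 1)
        self.score = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 1), nn.Sigmoid())

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_patches, dim)
        saliency = self.score(tokens)           # (batch, num_patches, 1)
        return tokens + tokens * saliency       # emphasize potentially salient lesion tokens


class AbnormalityAwareAttention(nn.Module):
    """Self-attention where every token attends to a reduced set of high-response
    ('abnormal') tokens, a hypothetical efficiency-oriented variant."""

    def __init__(self, dim: int, num_heads: int = 4, keep_ratio: float = 0.5):
        super().__init__()
        self.keep_ratio = keep_ratio
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        b, n, d = tokens.shape
        k = max(1, int(n * self.keep_ratio))
        # Score tokens by feature norm and keep the top-k as keys/values.
        scores = tokens.norm(dim=-1)                       # (b, n)
        idx = scores.topk(k, dim=1).indices                # (b, k)
        kv = torch.gather(tokens, 1, idx.unsqueeze(-1).expand(-1, -1, d))
        out, _ = self.attn(tokens, kv, kv)
        return tokens + out


if __name__ == "__main__":
    x = torch.randn(2, 196, 256)                 # e.g. 14x14 patch tokens from a ViT stem
    x = SaliencyEnhancementModule(256)(x)
    x = AbnormalityAwareAttention(256)(x)
    print(x.shape)                               # torch.Size([2, 196, 256])
```

Both modules take and return tokens of the same shape, which is what allows them, in principle, to be inserted between the blocks of any standard Vision Transformer without changing the rest of the architecture.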
Jiang, Y., Xu, K., Wang, X., Li, Y., Cui, H., Tao, Y., & Lin, H. (2022). SatFormer: Saliency-Guided Abnormality-Aware Transformer for Retinal Disease Classification in Fundus Image. In IJCAI International Joint Conference on Artificial Intelligence (pp. 987–994). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2022/138