Abstract
Computer-aided pathology diagnosis based on whole slide images (WSIs) is often formulated as a weakly supervised multiple instance learning (MIL) problem. Current approaches generally employ attention mechanisms to aggregate instance-level features. However, the weak supervisory signal and the imbalanced instance distribution often lead to inaccurate attention localization, compromising the performance and generalization capability of the MIL framework. To address these problems, this paper presents a novel MIL framework called FAMIL that identifies inaccurate attention assignments and refines them. FAMIL adopts a dual-branch structure and incorporates two novel online data augmentation strategies: attention-based Mixup (ABMix) and attention-based Masking (ABMask). ABMix emphasizes the significance of positive instances, generalizing Mixup to the MIL setting, while ABMask flexibly identifies challenging positive instances to improve the feature representation. Moreover, both methods are plug-and-play and can be easily embedded into attention-based MIL methods. Extensive experiments on three public benchmarks demonstrate the superiority of FAMIL over current state-of-the-art methods. On CAMELYON16, the test AUC for binary tumor classification reaches 92.61%; for cancer subtype classification, the AUC reaches 93.81% on TCGA-NSCLC and 98.41% on TCGA-RCC.
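To make the abstract's pipeline concrete, below is a minimal sketch of attention-based MIL pooling together with hypothetical implementations of the two augmentation ideas. The attention form (tanh scoring, in the style of standard attention-based MIL), the function names (`attention_pool`, `ab_mix`, `ab_mask`), the mixing coefficient `lam`, and the masking quantile are all assumptions for illustration — the paper's actual ABMix/ABMask formulations are not given in the abstract.

```python
import numpy as np

def attention_pool(feats, w, v):
    """Attention-based MIL pooling: score each instance, softmax-normalize,
    and return the attention weights plus the weighted bag embedding.
    feats: (n_instances, d); w: (d, h); v: (h,). (Generic form, not FAMIL's.)"""
    scores = np.tanh(feats @ w) @ v          # per-instance attention logits
    a = np.exp(scores - scores.max())        # numerically stable softmax
    a = a / a.sum()
    bag = (a[:, None] * feats).sum(axis=0)   # attention-weighted bag feature
    return a, bag

def ab_mix(feats_a, attn_a, feats_b, attn_b, lam=0.7):
    """Hypothetical ABMix sketch: Mixup-style convex combination of two
    attention-pooled bag embeddings, so high-attention (likely positive)
    instances dominate each bag's contribution."""
    bag_a = (attn_a[:, None] * feats_a).sum(axis=0)
    bag_b = (attn_b[:, None] * feats_b).sum(axis=0)
    return lam * bag_a + (1 - lam) * bag_b

def ab_mask(feats, attn, quantile=0.8):
    """Hypothetical ABMask sketch: drop the highest-attention instances so
    the model must rely on harder, lower-attention positive instances.
    Returns the surviving instances and renormalized attention."""
    thresh = np.quantile(attn, quantile)
    keep = attn < thresh
    return feats[keep], attn[keep] / attn[keep].sum()
```

Because both operations act only on instance features and attention weights, they can be dropped into any attention-based MIL aggregator, which is consistent with the plug-and-play claim above.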
Citation
Cheng, H., Huang, S., Cai, L., Xu, Y., Wang, R., & Zhang, Y. (2025). Focus Your Attention: Multiple Instance Learning With Attention Modification for Whole Slide Pathological Image Classification. IEEE Transactions on Circuits and Systems for Video Technology, 35(6), 5791–5804. https://doi.org/10.1109/TCSVT.2025.3528625