Due to the severe scarcity of labeled data, existing methods for medical visual question answering (VQA) usually rely on transfer learning to obtain effective image feature representations and on cross-modal fusion of visual and linguistic features to predict question-related answers. These two phases are performed independently, without considering the compatibility and applicability of the pre-trained features for cross-modal fusion. We therefore reformulate image feature pre-training as a multi-task learning paradigm, which forces the learned features to account for their applicability to the target image comprehension task and yields markedly superior representations. Furthermore, we introduce a cross-modal self-attention (CMSA) module that selectively captures long-range contextual relevance for more effective fusion of visual and linguistic features. Experimental results demonstrate that the proposed method outperforms existing state-of-the-art methods. Our code and models are available at https://github.com/haifangong/CMSA-MTPT-4-MedicalVQA.
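
As a concrete illustration of the fusion mechanism described above, the following is a minimal PyTorch sketch of a cross-modal self-attention block that attends jointly over image-region and question-word features. The class name, feature dimensions, and the use of nn.MultiheadAttention are illustrative assumptions rather than the authors' released implementation (see the linked repository for the actual code).

```python
# Hypothetical sketch of cross-modal self-attention (not the authors' code).
import torch
import torch.nn as nn

class CrossModalSelfAttention(nn.Module):
    """Attends over the concatenation of visual and linguistic tokens so that
    every position can aggregate long-range context from both modalities."""
    def __init__(self, dim: int = 512, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, visual: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # visual: (B, Nv, dim) image-region features; text: (B, Nt, dim) word features
        tokens = torch.cat([visual, text], dim=1)       # (B, Nv + Nt, dim)
        fused, _ = self.attn(tokens, tokens, tokens)    # self-attention across both modalities
        return self.norm(tokens + fused)                # residual connection + layer norm

# Toy usage with assumed shapes
if __name__ == "__main__":
    block = CrossModalSelfAttention()
    v = torch.randn(2, 49, 512)   # e.g. a 7x7 feature map flattened to 49 regions
    q = torch.randn(2, 12, 512)   # e.g. 12 question-word embeddings
    print(block(v, q).shape)      # torch.Size([2, 61, 512])
```

In this sketch the two modalities are concatenated into a single token sequence so that each image region can attend to every question word and vice versa, which is one straightforward way to realize the long-range cross-modal interactions the abstract refers to.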
Gong, H., Chen, G., Liu, S., Yu, Y., & Li, G. (2021). Cross-modal self-attention with multi-task pre-training for medical visual question answering. In ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval (pp. 456–460). Association for Computing Machinery, Inc. https://doi.org/10.1145/3460426.3463584