Medical visual question answering (Med-VQA) is important for clinical decision support and for patient engagement in patient-centered medical care. Compared with open-domain VQA tasks, VQA in the medical domain is more challenging due to limited training resources and the unique characteristics of medical images and domain vocabularies. In this paper, we propose and develop ETM-Trans, a novel deep transfer learning model that applies embedding topic modeling (ETM) to the textual questions to derive topic labels, which are paired with the associated medical images to fine-tune a model pre-trained on ImageNet. We also explore and implement a co-attention mechanism in which a residual network extracts visual features from the image that interact with a long short-term memory (LSTM) based question representation, providing fine-grained contextual information for answer derivation. To efficiently integrate the visual features from the image with the textual features from the question, we employ Multimodal Factorized Bilinear (MFB) pooling as well as Multimodal Factorized High-order (MFH) pooling. The ETM-Trans model won the international Med-VQA 2018 challenge, achieving the best WBSS score of 0.186.
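To make the fusion step concrete, below is a minimal PyTorch sketch of MFB pooling as described in the abstract: both modality features are projected into a shared expanded space, combined by an element-wise product, sum-pooled over groups of k factors, and then power- and L2-normalized. The dimensions (a 2048-d ResNet image feature, a 1024-d LSTM question feature, a 1000-d fused output, factor size k=5) are illustrative assumptions, not the paper's reported settings, and the dropout typically applied before pooling is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MFB(nn.Module):
    """Multimodal Factorized Bilinear pooling (sketch).

    Projects the image and question features into a shared
    (out_dim * k)-dimensional space, fuses them with an element-wise
    product, and sum-pools each group of k factors down to out_dim.
    All dimensions are illustrative assumptions.
    """
    def __init__(self, img_dim=2048, ques_dim=1024, out_dim=1000, k=5):
        super().__init__()
        self.k = k
        self.proj_img = nn.Linear(img_dim, out_dim * k)
        self.proj_ques = nn.Linear(ques_dim, out_dim * k)

    def forward(self, img_feat, ques_feat):
        # Element-wise product in the expanded (out_dim * k) space.
        joint = self.proj_img(img_feat) * self.proj_ques(ques_feat)
        # Sum-pool over each group of k factors: (B, out_dim*k) -> (B, out_dim).
        joint = joint.view(-1, joint.size(1) // self.k, self.k).sum(dim=2)
        # Signed square-root (power) normalization, then L2 normalization.
        joint = torch.sign(joint) * torch.sqrt(torch.abs(joint) + 1e-8)
        return F.normalize(joint, dim=1)

# Usage: fuse a batch of ResNet image features with LSTM question features.
fused = MFB()(torch.randn(4, 2048), torch.randn(4, 1024))
print(fused.shape)  # torch.Size([4, 1000])
```

MFH pooling can be viewed as cascading several such MFB blocks, with each block's intermediate product feeding the next, and the pooled outputs concatenated to capture higher-order interactions.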
Liu, F., Peng, Y., & Rosen, M. P. (2019). An Effective Deep Transfer Learning and Information Fusion Framework for Medical Visual Question Answering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11696 LNCS, pp. 238–247). Springer Verlag. https://doi.org/10.1007/978-3-030-28577-7_20