With the popularity of online social media, massive-scale multimodal information has brought new challenges to traditional Named Entity Disambiguation (NED) tasks. Recently, Multimodal Named Entity Disambiguation (MNED) has been proposed to link ambiguous mentions with the textual and visual contexts to a predefined knowledge graph. Existing attempts usually perform MNED by annotating multimodal mentions and adding multimodal features to traditional NED models. However, these studies may suffer from 1) failing to model multimodal information at the knowledge level, and 2) lacking multimodal annotation data against the large-scale unlabeled corpus. In this paper, we explore a pioneer study on leveraging multimodal knowledge learning to address the MNED task. Specifically, we first harvest multimodal knowledge in the Meta-Learning way, which is much easier than collecting ambiguous mention corpus. Then we design a knowledge-guided transfer learning strategy to extract unified representation from different modalities. Finally, we propose an Interactive Multimodal Learning Network (IMN) to fully utilize the multimodal information on both the mention and knowledge sides. Extensive experiments conducted on two public MNED datasets demonstrate that the proposed method achieves improvements over the state-of-the-art multimodal methods.
CITATION STYLE
Zhang, D., Huang, L., Ma, T., & Xue, H. (2022). Multimodal Knowledge Learning for Named Entity Disambiguation. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 3160–3169). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.230
Mendeley helps you to discover research relevant for your work.