Enhanced Topic Modeling with Multi-modal Representation Learning

Abstract

Existing topic modeling methods rely primarily on text features to discover topics, without considering other data modalities such as images. Recent advances in multi-modal representation learning show that multi-modal features can enrich the semantic information within text data for downstream tasks. This paper proposes a novel Neural Topic Model framework in a multi-modal setting, where visual and textual information are jointly utilized to derive text-based topic models. The framework includes a Gated Data Fusion module that learns text-specific visual representations to generate contextualized multi-modal features. These features are then mapped into a joint latent space by a Neural Topic Model to learn topic distributions. Experiments on diverse datasets show that the proposed framework significantly improves topic quality.
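
To make the described architecture concrete, below is a minimal PyTorch sketch of the pipeline: a gated fusion of text and image embeddings followed by a VAE-style neural topic model. All module names, dimensions, and the sigmoid-gate formulation are illustrative assumptions, not the authors' released implementation.

# Illustrative sketch only: a sigmoid-gated fusion of text and image
# embeddings feeding a VAE-style neural topic model. Dimensions, layer
# choices, and the gating formula are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedDataFusion(nn.Module):
    """Gate visual features by the text context, then fuse (assumed form)."""
    def __init__(self, text_dim=768, image_dim=512, fused_dim=768):
        super().__init__()
        self.proj_image = nn.Linear(image_dim, fused_dim)
        self.proj_text = nn.Linear(text_dim, fused_dim)
        # The gate decides, per dimension, how much visual signal to admit.
        self.gate = nn.Linear(text_dim + image_dim, fused_dim)

    def forward(self, text_emb, image_emb):
        g = torch.sigmoid(self.gate(torch.cat([text_emb, image_emb], dim=-1)))
        # Text-specific visual representation: visual features scaled by the gate.
        visual_ctx = g * self.proj_image(image_emb)
        # Contextualized multi-modal feature.
        return self.proj_text(text_emb) + visual_ctx

class NeuralTopicModel(nn.Module):
    """VAE-style topic model over the fused features (standard NTM recipe)."""
    def __init__(self, fused_dim=768, num_topics=50, vocab_size=20000):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(fused_dim, 256), nn.Softplus())
        self.fc_mu = nn.Linear(256, num_topics)
        self.fc_logvar = nn.Linear(256, num_topics)
        self.decoder = nn.Linear(num_topics, vocab_size)  # topic-word weights

    def forward(self, fused):
        h = self.encoder(fused)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        theta = F.softmax(z, dim=-1)  # document-topic distribution
        recon = F.log_softmax(self.decoder(theta), dim=-1)  # word log-probabilities
        return recon, theta, mu, logvar

# Usage: CLIP-like encoders would supply text_emb / image_emb here.
fusion, ntm = GatedDataFusion(), NeuralTopicModel()
text_emb, image_emb = torch.randn(8, 768), torch.randn(8, 512)
recon, theta, mu, logvar = ntm(fusion(text_emb, image_emb))
# Training would combine a reconstruction loss on the bag-of-words with the
# usual KL term: kl = -0.5 * (1 + logvar - mu**2 - logvar.exp()).sum(-1)

The key design choice this sketch assumes is that the gate conditions on both modalities but only modulates the visual stream, so the text representation is preserved and images contribute only where the text context admits them.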

Citation (APA)

Zhang, D., Wang, Y., Bashar, M. A., & Nayak, R. (2023). Enhanced Topic Modeling with Multi-modal Representation Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13935 LNCS, pp. 393–404). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-33374-3_31
