Enhanced Topic Modeling with Multi-modal Representation Learning

Abstract

Existing topic modeling methods rely primarily on text features to discover topics, without considering other data modalities such as images. Recent advances in multi-modal representation learning show that multi-modal features can enrich the semantic information within text data for downstream tasks. This paper proposes a novel Neural Topic Model framework in a multi-modal setting, where visual and textual information are jointly utilized to derive text-based topic models. The framework includes a Gated Data Fusion module that learns text-specific visual representations to generate contextualized multi-modal features. These features are then mapped into a joint latent space by a Neural Topic Model to learn topic distributions. Experiments on diverse datasets show that the proposed framework significantly improves topic quality.
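
To make the described architecture concrete, below is a minimal PyTorch sketch of the pipeline: a gated fusion of text and image embeddings followed by a VAE-style neural topic model. All module names, dimensions, and the sigmoid-gate formulation are illustrative assumptions, not the authors' released implementation.

# Illustrative sketch only: a sigmoid-gated fusion of text and image
# embeddings feeding a VAE-style neural topic model. Dimensions, layer
# choices, and the gating formula are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedDataFusion(nn.Module):
    """Gate visual features by the text context, then fuse (assumed form)."""
    def __init__(self, text_dim=768, image_dim=512, fused_dim=768):
        super().__init__()
        self.proj_image = nn.Linear(image_dim, fused_dim)
        self.proj_text = nn.Linear(text_dim, fused_dim)
        # The gate decides, per dimension, how much visual signal to admit.
        self.gate = nn.Linear(text_dim + image_dim, fused_dim)

    def forward(self, text_emb, image_emb):
        g = torch.sigmoid(self.gate(torch.cat([text_emb, image_emb], dim=-1)))
        # Text-specific visual representation: visual features scaled by the gate.
        visual_ctx = g * self.proj_image(image_emb)
        # Contextualized multi-modal feature.
        return self.proj_text(text_emb) + visual_ctx

class NeuralTopicModel(nn.Module):
    """VAE-style topic model over the fused features (standard NTM recipe)."""
    def __init__(self, fused_dim=768, num_topics=50, vocab_size=20000):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(fused_dim, 256), nn.Softplus())
        self.fc_mu = nn.Linear(256, num_topics)
        self.fc_logvar = nn.Linear(256, num_topics)
        self.decoder = nn.Linear(num_topics, vocab_size)  # topic-word weights

    def forward(self, fused):
        h = self.encoder(fused)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        theta = F.softmax(z, dim=-1)  # document-topic distribution
        recon = F.log_softmax(self.decoder(theta), dim=-1)  # word log-probabilities
        return recon, theta, mu, logvar

# Usage: CLIP-like encoders would supply text_emb / image_emb here.
fusion, ntm = GatedDataFusion(), NeuralTopicModel()
text_emb, image_emb = torch.randn(8, 768), torch.randn(8, 512)
recon, theta, mu, logvar = ntm(fusion(text_emb, image_emb))
# Training would combine a reconstruction loss on the bag-of-words with the
# usual KL term: kl = -0.5 * (1 + logvar - mu**2 - logvar.exp()).sum(-1)

The key design choice this sketch assumes is that the gate conditions on both modalities but only modulates the visual stream, so the text representation is preserved and images contribute only where the text context admits them.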

Citation (APA)

Zhang, D., Wang, Y., Bashar, M. A., & Nayak, R. (2023). Enhanced Topic Modeling with Multi-modal Representation Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13935 LNCS, pp. 393–404). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-33374-3_31
