Deep Multimodal Clustering with Cross Reconstruction

5 Citations · 18 Readers (Mendeley users who have this article in their library)

This article is free to access.

Abstract

Recently, there has been surging interest in multimodal clustering, and extracting common features plays a critical role in these methods. However, because they ignore the fact that data in different modalities share similar distributions in feature space, most existing works do not fully mine the inter-modal distribution relationships, which eventually leads to poor common features. To address this issue, we propose the deep multimodal clustering with cross reconstruction method, which first extracts multimodal features in an unsupervised way and then clusters the extracted features. The proposed cross reconstruction builds latent connections among different modalities, effectively reducing the distribution differences in feature space. Theoretical analysis shows that cross reconstruction reduces the Wasserstein distance between the multimodal feature distributions. Experimental results on six benchmark datasets demonstrate that our method achieves clear improvements over several state-of-the-art methods.
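The cross-reconstruction idea described in the abstract can be illustrated with a minimal NumPy sketch: each modality is decoded not only from its own features (self-reconstruction) but also from the other modality's features, which pressures the two feature distributions to align. This is an illustrative toy under assumed conditions (paired samples across modalities, linear encoders/decoders in place of the paper's deep networks); all names and shapes here are hypothetical, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired two-modality data: row i of x_a and x_b describe the same instance.
x_a = rng.normal(size=(100, 8))   # modality A, 8-dim observations
x_b = rng.normal(size=(100, 6))   # modality B, 6-dim observations

d = 4  # shared latent feature dimension

# Linear maps as stand-ins for deep encoder/decoder networks.
enc_a = rng.normal(scale=0.1, size=(8, d))
enc_b = rng.normal(scale=0.1, size=(6, d))
dec_a = rng.normal(scale=0.1, size=(d, 8))
dec_b = rng.normal(scale=0.1, size=(d, 6))

z_a = x_a @ enc_a   # features extracted from modality A
z_b = x_b @ enc_b   # features extracted from modality B

# Self-reconstruction: decode each modality from its own features.
loss_self = (np.mean((z_a @ dec_a - x_a) ** 2)
             + np.mean((z_b @ dec_b - x_b) ** 2))

# Cross reconstruction: decode modality A from B's features and vice versa.
# Minimizing this term forces z_a and z_b toward a common distribution.
loss_cross = (np.mean((z_b @ dec_a - x_a) ** 2)
              + np.mean((z_a @ dec_b - x_b) ** 2))

# The combined objective would be minimized w.r.t. the encoder/decoder weights.
loss = loss_self + loss_cross
print(float(loss) > 0.0)
```

In a full implementation, minimizing `loss_cross` alongside `loss_self` is what ties the per-modality feature spaces together; the clustering step then operates on the aligned features.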

Cite

CITATION STYLE

APA

Zhang, X., Tang, X., Zong, L., Liu, X., & Mu, J. (2020). Deep Multimodal Clustering with Cross Reconstruction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12084 LNAI, pp. 305–317). Springer. https://doi.org/10.1007/978-3-030-47426-3_24
