Deep Multimodal Clustering with Cross Reconstruction

5 Citations · 18 Readers (Mendeley users who have this article in their library)

This article is free to access.

Abstract

Recently, there has been surging interest in multimodal clustering, and extracting common features plays a critical role in these methods. However, because they ignore the fact that data in different modalities share similar distributions in feature space, most existing works do not fully mine the inter-modal distribution relationships, which eventually leads to poor common features. To address this issue, we propose the deep multimodal clustering with cross reconstruction method, which first extracts multimodal features in an unsupervised way and then clusters the extracted features. The proposed cross reconstruction builds latent connections among different modalities, effectively reducing the distribution differences in feature space. Theoretical analysis shows that cross reconstruction reduces the Wasserstein distance between the multimodal feature distributions. Experimental results on six benchmark datasets demonstrate that our method achieves clear improvements over several state-of-the-art methods.
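The cross-reconstruction idea described in the abstract can be illustrated with a minimal NumPy sketch: each modality is decoded not only from its own features (self-reconstruction) but also from the other modality's features, which pressures the two feature distributions to align. This is an illustrative toy under assumed conditions (paired samples across modalities, linear encoders/decoders in place of the paper's deep networks); all names and shapes here are hypothetical, not the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired two-modality data: row i of x_a and x_b describe the same instance.
x_a = rng.normal(size=(100, 8))   # modality A, 8-dim observations
x_b = rng.normal(size=(100, 6))   # modality B, 6-dim observations

d = 4  # shared latent feature dimension

# Linear maps as stand-ins for deep encoder/decoder networks.
enc_a = rng.normal(scale=0.1, size=(8, d))
enc_b = rng.normal(scale=0.1, size=(6, d))
dec_a = rng.normal(scale=0.1, size=(d, 8))
dec_b = rng.normal(scale=0.1, size=(d, 6))

z_a = x_a @ enc_a   # features extracted from modality A
z_b = x_b @ enc_b   # features extracted from modality B

# Self-reconstruction: decode each modality from its own features.
loss_self = (np.mean((z_a @ dec_a - x_a) ** 2)
             + np.mean((z_b @ dec_b - x_b) ** 2))

# Cross reconstruction: decode modality A from B's features and vice versa.
# Minimizing this term forces z_a and z_b toward a common distribution.
loss_cross = (np.mean((z_b @ dec_a - x_a) ** 2)
              + np.mean((z_a @ dec_b - x_b) ** 2))

# The combined objective would be minimized w.r.t. the encoder/decoder weights.
loss = loss_self + loss_cross
print(float(loss) > 0.0)
```

In a full implementation, minimizing `loss_cross` alongside `loss_self` is what ties the per-modality feature spaces together; the clustering step then operates on the aligned features.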

Cite

CITATION STYLE

APA

Zhang, X., Tang, X., Zong, L., Liu, X., & Mu, J. (2020). Deep Multimodal Clustering with Cross Reconstruction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12084 LNAI, pp. 305–317). Springer. https://doi.org/10.1007/978-3-030-47426-3_24
