A Multimodal Variational Encoder-Decoder Framework for Micro-video Popularity Prediction


Abstract

Predicting the popularity of a micro-video is a challenging task due to a number of uncertain factors that affect the popularity distribution, such as the diversity of video content, varying user interests, and complex online interactions. In this paper, we propose a multimodal variational encoder-decoder (MMVED) framework that models these uncertain factors as randomness in the mapping from multimodal features to popularity. Specifically, the MMVED first encodes features from multiple modalities in the observation space into latent representations and learns their probability distributions via variational inference, so that only the relevant features of each input modality are extracted into the latent representations. The modality-specific hidden representations are then fused through Bayesian reasoning, so that the complementary information from all modalities is fully utilized. Finally, a temporal decoder, implemented as a recurrent neural network, predicts the popularity sequence of a given micro-video. Experiments conducted on a real-world dataset demonstrate the effectiveness of the proposed model on the micro-video popularity prediction task.
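The Bayesian fusion step described above can be illustrated with a small sketch. A common way to combine modality-specific Gaussian posteriors by Bayesian reasoning is a product-of-experts rule: precisions add, and the fused mean is the precision-weighted average of the modality means. The function names below and the choice of the product-of-experts formulation are our assumptions for illustration, not necessarily the paper's exact parameterization.

```python
import numpy as np

def fuse_gaussians(mus, logvars):
    """Product-of-experts fusion of diagonal Gaussian posteriors.

    Each modality m contributes N(mu_m, var_m). The fused Gaussian has
    precision = sum of modality precisions, and mean = precision-weighted
    average of the modality means. (Illustrative sketch; the paper's
    fusion may differ in details.)
    """
    precisions = [np.exp(-lv) for lv in logvars]   # 1 / var per modality
    total_prec = np.sum(precisions, axis=0)
    fused_var = 1.0 / total_prec
    fused_mu = fused_var * np.sum(
        [p * m for p, m in zip(precisions, mus)], axis=0
    )
    return fused_mu, np.log(fused_var)

def sample_latent(mu, logvar, rng):
    """Reparameterized sample z = mu + sigma * eps, as used in
    variational inference to keep sampling differentiable."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps
```

For example, fusing two identical unit-variance Gaussians leaves the mean unchanged and halves the variance, reflecting the increased confidence from two agreeing modalities.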

Citation (APA)

Xie, J., Zhu, Y., Zhang, Z., Peng, J., Yi, J., Hu, Y., … Chen, Z. (2020). A Multimodal Variational Encoder-Decoder Framework for Micro-video Popularity Prediction. In The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020 (pp. 2542–2548). Association for Computing Machinery, Inc. https://doi.org/10.1145/3366423.3380004
