YuYin: a multi-task learning model of multi-modal e-commerce background music recommendation


Abstract

Appropriate background music in e-commerce advertisements can help stimulate consumption and build product image. However, many factors, such as emotion and product category, must be taken into account, which makes manual music selection time-consuming and dependent on professional knowledge; automatically recommending music for video therefore becomes crucial. Because no e-commerce advertisement dataset exists, we first establish Commercial-98K, a large-scale e-commerce advertisement dataset covering the major e-commerce categories. We then propose YuYin, a video-music retrieval model that learns the correlation between video and music. We introduce a weighted fusion module (WFM) to fuse emotion features and audio features from music into a more fine-grained music representation. Considering the similarity of music within the same product category, YuYin is trained by multi-task learning, exploring the correlation between video and music by cross-matching video, music, and tags alongside a category prediction task. Extensive experiments show that YuYin achieves a remarkable improvement in video-music retrieval on Commercial-98K.
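The abstract's description of the weighted fusion module and the multi-task objective can be made concrete with a short sketch. The PyTorch code below is a minimal illustration rather than the authors' implementation: the module names, feature dimensions, the sigmoid gate, the in-batch contrastive matching loss, and the temperature value are assumptions introduced here for clarity, and the tag-matching branch mentioned in the abstract is omitted.

```python
# Minimal sketch (not the authors' code): a weighted fusion of music audio and
# emotion features, plus a two-head multi-task objective combining video-music
# matching with product-category prediction. Dimensions and losses are assumed.

import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightedFusionModule(nn.Module):
    """Fuse music audio features with music emotion features via a learned gate."""

    def __init__(self, audio_dim=128, emotion_dim=64, out_dim=256):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, out_dim)
        self.emotion_proj = nn.Linear(emotion_dim, out_dim)
        # Scalar gate deciding how much each feature stream contributes.
        self.gate = nn.Sequential(nn.Linear(2 * out_dim, 1), nn.Sigmoid())

    def forward(self, audio_feat, emotion_feat):
        a = self.audio_proj(audio_feat)
        e = self.emotion_proj(emotion_feat)
        w = self.gate(torch.cat([a, e], dim=-1))   # (batch, 1) fusion weight
        return w * a + (1.0 - w) * e               # weighted music representation


class MultiTaskHead(nn.Module):
    """Video-music matching (contrastive) plus category prediction."""

    def __init__(self, video_dim=512, music_dim=256, embed_dim=256, num_categories=10):
        super().__init__()
        self.video_proj = nn.Linear(video_dim, embed_dim)
        self.music_proj = nn.Linear(music_dim, embed_dim)
        self.category_clf = nn.Linear(embed_dim, num_categories)

    def forward(self, video_feat, music_feat, category_labels):
        v = F.normalize(self.video_proj(video_feat), dim=-1)
        m = F.normalize(self.music_proj(music_feat), dim=-1)

        # Matching task: in-batch contrastive loss over the video-music similarity matrix.
        logits = v @ m.t() / 0.07
        targets = torch.arange(v.size(0), device=v.device)
        match_loss = (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets)) / 2

        # Auxiliary task: predict the product category from the video embedding.
        cat_loss = F.cross_entropy(self.category_clf(v), category_labels)
        return match_loss + cat_loss


if __name__ == "__main__":
    wfm = WeightedFusionModule()
    head = MultiTaskHead()
    audio, emotion = torch.randn(4, 128), torch.randn(4, 64)
    video, labels = torch.randn(4, 512), torch.randint(0, 10, (4,))
    music = wfm(audio, emotion)
    print(head(video, music, labels).item())
```

In this sketch the auxiliary category loss simply shares the video embedding with the matching head, which is one common way to exploit the within-category similarity of music that the abstract describes.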

Citation (APA)

Ma, L., Wu, X., Tang, R., Zhong, C., & Zhang, K. (2023). YuYin: a multi-task learning model of multi-modal e-commerce background music recommendation. Eurasip Journal on Audio, Speech, and Music Processing, 2023(1). https://doi.org/10.1186/s13636-023-00306-6
