Multimodal Music Emotion Recognition Method Based on the Combination of Knowledge Distillation and Transfer Learning

Abstract

The main difficulty in music emotion recognition is the lack of sufficient labeled data, so emotion recognition models must be trained on labeled data with unbalanced categories. Accurate labeling of emotion categories is not only costly and time-consuming but also requires labelers with an extensive musical background. At the same time, the emotion of a piece of music is affected by many factors: singing technique, musical style, arrangement, lyrics, and other elements all influence how musical emotion is expressed. This paper proposes a multimodal method based on the combination of knowledge distillation and music style transfer learning and verifies its effectiveness on 20,000 songs. Experiments show that, compared with traditional methods such as audio-only, lyrics-only, and plain audio-plus-lyrics multimodal approaches, the proposed method significantly improves both the accuracy of emotion recognition and its generalization ability.
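
The abstract does not describe the paper's architecture, but the knowledge-distillation component it names conventionally follows the soft-target objective of Hinton et al. (2015): a student model is trained against both scarce hard labels and the temperature-softened predictions of a teacher. The sketch below illustrates that general objective, assuming PyTorch; the function name distillation_loss and the hyperparameters T (temperature) and alpha (mixing weight) are illustrative choices, not taken from the paper.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soften the teacher and student distributions with temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_soft_student = F.log_softmax(student_logits / T, dim=-1)
    # KL divergence between the softened distributions; the T^2 factor
    # keeps its gradient magnitude comparable to the hard-label term.
    kd_term = F.kl_div(log_soft_student, soft_teacher,
                       reduction="batchmean") * (T * T)
    # Ordinary cross-entropy against the (possibly scarce) hard labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

In a setting like the one the abstract describes, the teacher could be a model pretrained on a related task (the transfer-learning side), while the student is the multimodal emotion classifier trained on the limited labeled songs; this pairing is an assumption for illustration, not a claim about the paper's exact design.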

Citation (APA)

Tong, G. (2022). Multimodal Music Emotion Recognition Method Based on the Combination of Knowledge Distillation and Transfer Learning. Scientific Programming, 2022. https://doi.org/10.1155/2022/2802573
