Music emotion recognition using recurrent neural networks and pretrained models

Abstract

The article presents experiments using recurrent neural networks for emotion detection in musical segments. Trained regression models were used to predict continuous emotion values on the axes of Russell’s circumplex model. The process of extracting audio features and creating sequential data for training networks with long short-term memory (LSTM) units is presented. Models were implemented using the WekaDeeplearning4j package, and a number of experiments were carried out on data with different feature sets and varying segmentation. The experiments demonstrated the usefulness of dividing the data into sequences and the value of using recurrent networks to recognize emotions in music; their results even exceeded those of the SVM regression algorithm. The author analyzed the effect of the network structure and of the feature set used on the results of the regressors predicting values on the two axes of the emotion model: arousal and valence. Finally, the use of a pretrained model for processing audio features and training a recurrent network on new sequences of features is presented.
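
To make the described pipeline concrete, below is a minimal, hypothetical sketch of an LSTM regressor over sequences of per-segment audio-feature vectors that predicts arousal and valence. It is not the paper's WekaDeeplearning4j implementation; it uses Keras instead, and the sequence length, feature dimensionality, layer sizes, value range, and data are all illustrative assumptions.

```python
# Illustrative sketch only (not the paper's WekaDeeplearning4j code):
# an LSTM regressor predicting continuous arousal and valence values
# from sequences of audio-feature vectors. All shapes and
# hyperparameters below are assumptions.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_segments = 6    # assumed: consecutive segments per musical excerpt
n_features = 84   # assumed: size of the audio feature vector per segment

# X: sequences of feature vectors; y: [arousal, valence] annotations,
# here assumed to lie in [-1, 1]. Random data stands in for real excerpts.
X = np.random.rand(200, n_segments, n_features).astype("float32")
y = np.random.uniform(-1.0, 1.0, size=(200, 2)).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(n_segments, n_features)),
    layers.LSTM(64),                     # recurrent layer over the segment sequence
    layers.Dense(2, activation="tanh"),  # regression outputs: arousal and valence
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=16, validation_split=0.2)
```

In this setting, each training example is one musical excerpt represented as an ordered sequence of per-segment feature vectors, mirroring the segmentation of audio features into sequences described in the abstract.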

Cite

APA: Grekow, J. (2021). Music emotion recognition using recurrent neural networks and pretrained models. Journal of Intelligent Information Systems, 57(3), 531–546. https://doi.org/10.1007/s10844-021-00658-5
