A hierarchical attentive deep neural network model for semantic music annotation integrating multiple music representations

Qianqian Wang; Feng Su; Yuyang Wang

Conference Proceedings, Open Access

ICMR 2019 - Proceedings of the 2019 ACM International Conference on Multimedia Retrieval (2019) 150-158

DOI: 10.1145/3323873.3325031

Abstract

Automatically assigning a set of appropriate semantic tags to a music piece gives people an effective way to make use of the massive and ever-increasing body of online and offline music data. In this paper, we propose a novel content-based automatic music annotation model that hierarchically combines attentive convolutional networks and recurrent networks for music representation learning, structure modelling, and tag prediction. The model first uses two separate attentive convolutional networks, each composed of multiple gated linear units (GLUs), to learn effective representations from both the 1-D raw waveform and the 2-D Mel-spectrogram of the music; combining the two captures informative features for the annotation task better than relying on any single representation channel. The model then uses bidirectional Long Short-Term Memory (LSTM) networks to model the time-varying structures embedded in the description sequences of the music, and further introduces a dual-state LSTM network to encode temporal correlations between the two representation channels, which enriches the descriptions of the music. Finally, the model adaptively aggregates the music descriptions generated at every time step with a self-attentive multi-weighting mechanism for music tag prediction. The proposed model achieves state-of-the-art results on the public MagnaTagATune music dataset, demonstrating its effectiveness on music annotation.
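
Two building blocks named in the abstract, the gated linear unit and attention-weighted aggregation over time steps, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`glu`, `attentive_pool`), the single fixed `query` vector, and the dot-product scoring are assumptions chosen for illustration, whereas the paper's model learns its parameters and uses a multi-weighting scheme whose details are not given in the abstract.

```python
import math


def sigmoid(x):
    """Logistic function, used as the gate in a GLU."""
    return 1.0 / (1.0 + math.exp(-x))


def glu(vector):
    """Gated linear unit on a flat vector (illustrative form).

    The vector is split in half; the first half carries values and the
    second half carries gate pre-activations: out = values * sigmoid(gates).
    """
    if len(vector) % 2 != 0:
        raise ValueError("GLU input length must be even")
    half = len(vector) // 2
    values, gates = vector[:half], vector[half:]
    return [v * sigmoid(g) for v, g in zip(values, gates)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(steps, query):
    """Aggregate per-time-step descriptions into one vector.

    Assumption: each step is scored by its dot product with `query`
    (a hypothetical stand-in for learned attention parameters), the
    scores are normalised with softmax, and the steps are summed with
    those weights.
    """
    scores = [sum(s * q for s, q in zip(step, query)) for step in steps]
    weights = softmax(scores)
    dim = len(steps[0])
    pooled = [sum(w * step[d] for w, step in zip(weights, steps))
              for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    # Zero gate pre-activations give sigmoid(0) = 0.5, halving the values.
    print(glu([1.0, 2.0, 0.0, 0.0]))
    # A zero query scores every step equally, so pooling is a plain mean.
    print(attentive_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0]))
```

In a trained model, the gate pre-activations and the attention scores would come from learned layers, so informative time steps receive larger weights than uninformative ones before tag prediction.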

Citation (APA)

Wang, Q., Su, F., & Wang, Y. (2019). A hierarchical attentive deep neural network model for semantic music annotation integrating multiple music representations. In ICMR 2019 - Proceedings of the 2019 ACM International Conference on Multimedia Retrieval (pp. 150–158). Association for Computing Machinery, Inc. https://doi.org/10.1145/3323873.3325031
