Temporal pyramid pooling convolutional neural network for cover song identification

Abstract

Cover song identification is an important problem in the field of Music Information Retrieval. Most existing methods rely on hand-crafted features and sequence alignment, and further breakthroughs are hard to achieve. In this paper, Convolutional Neural Networks (CNNs) are used for representation learning on this task. We show that they can be naturally adapted to handle key transposition in cover songs. Additionally, Temporal Pyramid Pooling is used to extract information at different scales and to transform songs of different lengths into fixed-dimensional representations. Furthermore, a training scheme is designed to enhance the robustness of our model. Extensive experiments demonstrate that, combined with these techniques, our approach is robust against the musical variations found in cover songs and outperforms state-of-the-art methods on several datasets with low time complexity.
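
To illustrate how pooling over a temporal pyramid can map inputs of different lengths to a vector of fixed size, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `temporal_pyramid_pool`, the pyramid levels `(1, 2, 4)`, the use of max pooling, and the `(channels, time)` layout are all assumptions made for the example.

```python
import numpy as np


def temporal_pyramid_pool(features, levels=(1, 2, 4)):
    """Max-pool a (channels, time) feature map into a fixed-length vector.

    Hypothetical helper: for each pyramid level n, the time axis is split
    into n roughly equal segments and each segment is max-pooled per channel.
    The output length is channels * sum(levels), independent of `time`.
    """
    channels, time = features.shape
    if time < max(levels):
        raise ValueError("need at least max(levels) time frames")
    pooled = []
    for n in levels:
        # Integer segment boundaries that cover the whole time axis.
        edges = np.linspace(0, time, n + 1).astype(int)
        for start, end in zip(edges[:-1], edges[1:]):
            pooled.append(features[:, start:end].max(axis=1))
    return np.concatenate(pooled)


# Two "songs" with different numbers of frames yield vectors of equal size.
short_song = np.random.rand(8, 50)
long_song = np.random.rand(8, 333)
short_vec = temporal_pyramid_pool(short_song)
long_vec = temporal_pyramid_pool(long_song)
```

With 8 channels and levels `(1, 2, 4)`, both vectors have 8 × (1 + 2 + 4) = 56 entries. The coarsest level summarizes the whole song, while finer levels keep some information about where in time a feature occurs.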

Cite

Yu, Z., Xu, X., Chen, X., & Yang, D. (2019). Temporal pyramid pooling convolutional neural network for cover song identification. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 4846–4852). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/673
