Automatic Lyrics Transcription of Polyphonic Music With Lyrics-Chord Multi-Task Learning

Abstract

Lyrics are the words that make up a song, while chords are harmonic sets of multiple notes in music. Both are essential components of polyphonic music, i.e., singing vocals mixed with instrumental accompaniment. In a traditional lyrics transcription pipeline, the singing vocals are first extracted from the polyphonic music and then transcribed, with the two steps optimized independently. In this paper, we propose novel end-to-end network architectures designed to disentangle lyrics from chords in polyphonic music, so that lyrics can be transcribed effectively in a single step. We treat chords as musical words, analogous to the lexical words of lyrics. We start by studying a single-task lyrics transcriber, which serves both as the reference baseline and as the initial model from which the multi-task lyrics transcription solutions are developed. The main idea is to exploit the chord transcription available in the training data through multi-task training to improve lyrics transcription. The experiments show that the proposed multi-task lyrics transcriber significantly outperforms other competing solutions, achieving a word error rate (WER) of 31.82% on a standard test dataset.
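
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the reference and the transcribed lyrics, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words.

    Illustrative only: tokenization by whitespace is an assumption, and
    real evaluations usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition, a WER of 31.82% means roughly one word-level error for every three reference words.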

Cite

Gao, X., Gupta, C., & Li, H. (2022). Automatic Lyrics Transcription of Polyphonic Music With Lyrics-Chord Multi-Task Learning. IEEE/ACM Transactions on Audio Speech and Language Processing, 30, 2280–2294. https://doi.org/10.1109/TASLP.2022.3190742