Audio source separation is an important but challenging problem in many applications, because only a single-channel mixed signal is available. This work proposes a novel Non-Local Multi-scale Multi-band DenseNet model, termed NLMMDenseNet, for audio source separation; it jointly exploits long-term dependencies and recovers the information missing around band borders. Specifically, to better exploit the long-term dependencies in the audio spectrogram, we incorporate a non-local layer into MMDenseNet, which enables the model to capture the features of different audio sources. In addition, the proposed model captures cross-band features, which are used to recover the information missing around band borders. By exploiting both global information and band-border information, the proposed model outperforms state-of-the-art methods on the widely used MIR-1K and DSD100 datasets.
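To make the idea of a non-local layer concrete, the following is a minimal NumPy sketch of a generic non-local operation over a spectrogram feature map. It is not the authors' implementation: the abstract does not say which non-local variant is used, so the embedded-Gaussian (softmax) form is assumed here, and the function name `non_local_block`, the weight matrices, and the tensor shapes are all hypothetical.

```python
import numpy as np

def non_local_block(x, w_theta, w_phi, w_g, w_out):
    """Generic non-local operation (assumed embedded-Gaussian form).

    x       : feature map of shape (C, F, T) = channels, frequency bins, frames
    w_theta : (C, C') projection for queries   (stands in for a 1x1 convolution)
    w_phi   : (C, C') projection for keys      (stands in for a 1x1 convolution)
    w_g     : (C, C') projection for values    (stands in for a 1x1 convolution)
    w_out   : (C', C) projection back to C channels
    """
    c, f, t = x.shape
    flat = x.reshape(c, f * t).T              # (N, C), one row per time-frequency position
    theta = flat @ w_theta                    # (N, C')
    phi = flat @ w_phi                        # (N, C')
    g = flat @ w_g                            # (N, C')

    # Pairwise similarity between every position and every other position,
    # normalised with a numerically stable softmax over each row.
    scores = theta @ phi.T                    # (N, N)
    scores = scores - scores.max(axis=1, keepdims=True)
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)

    y = attn @ g                              # (N, C'), weighted sum over all positions
    z = y @ w_out + flat                      # residual connection keeps the input path
    return z.T.reshape(c, f, t), attn

# Toy usage with random values (hypothetical sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 5))            # 4 channels, 6 frequency bins, 5 frames
w_theta, w_phi, w_g = (rng.standard_normal((4, 2)) for _ in range(3))
w_out = rng.standard_normal((2, 4))
out, attn = non_local_block(x, w_theta, w_phi, w_g, w_out)
```

Because every time-frequency position attends to every other position, the output at one frame can depend on distant frames and distant frequency bins, which is the kind of long-term dependency the abstract refers to. The attention matrix grows quadratically with the number of positions, so practical models usually apply such a layer on downsampled feature maps.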
CITATION STYLE
Huang, Y. (2019). Non-local MMDenseNet with Cross-Band Features for Audio Source Separation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11936 LNCS, pp. 53–64). Springer. https://doi.org/10.1007/978-3-030-36204-1_4