Multi-resolution fully convolutional neural networks for monaural audio source separation

Abstract

In deep neural networks with convolutional layers, all the neurons in a given layer typically have receptive fields (RFs) of the same size and the same resolution. Layers whose neurons have large RFs capture global information from the input features, while layers whose neurons have small RFs capture local details at high resolution. In this work, we introduce novel deep multi-resolution fully convolutional neural networks (MR-FCN), in which each layer has neurons with a range of RF sizes, so that it extracts multi-resolution features capturing both global and local information from its input features. The proposed MR-FCN is applied to separating the singing voice from mixtures of music sources. Experimental results show that MR-FCN improves performance over feedforward deep neural networks (DNNs) and single-resolution deep fully convolutional neural networks (FCNs) on the audio source separation problem.
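The idea of a layer whose neurons have different RF sizes can be illustrated with a small sketch. This is not the authors' implementation: it is a one-dimensional simplification in plain Python, the function names `conv1d_same` and `multi_resolution_layer` are hypothetical, the kernel sizes (3, 5, 9) are arbitrary, and moving-average kernels stand in for learned weights. It only shows parallel convolutions of different widths applied to the same input, with the resulting feature maps kept side by side.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate `signal` with an odd-length `kernel`.

    The input is zero-padded so the output has the same length as the input.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def multi_resolution_layer(signal, kernel_sizes=(3, 5, 9)):
    """Return one feature map per receptive-field size.

    Assumption for illustration: each branch uses a fixed moving-average
    kernel; in a trained network these weights would be learned.
    """
    feature_maps = []
    for size in kernel_sizes:
        kernel = [1.0 / size] * size  # small size = local, large size = global
        feature_maps.append(conv1d_same(signal, kernel))
    return feature_maps
```

Feeding a unit impulse such as `[0.0] * 4 + [1.0] + [0.0] * 4` through `multi_resolution_layer` makes the difference in resolution visible: the number of nonzero outputs in each feature map equals that branch's kernel size, so the small-kernel branch keeps the impulse localized while the large-kernel branch spreads it across the whole input.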

Cite

Grais, E. M., Wierstorf, H., Ward, D., & Plumbley, M. D. (2018). Multi-resolution fully convolutional neural networks for monaural audio source separation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10891 LNCS, pp. 340–350). Springer Verlag. https://doi.org/10.1007/978-3-319-93764-9_32
