Deep neural network based multichannel audio source separation

Abstract

This chapter presents a multichannel audio source separation framework in which deep neural networks (DNNs) are used to model the source spectra and are combined with the classical multichannel Gaussian model to exploit spatial information. The parameters are estimated in an iterative expectation-maximization (EM) fashion and used to derive a multichannel Wiener filter. Different design choices and their impact on performance are discussed, including the cost function used for DNN training, the number of parameter updates, the use of multiple DNNs, and the use of weighted parameter updates. Finally, we present the application of the framework to a speech enhancement task and a music separation task. The experimental results show the benefit of the multichannel DNN-based approach over both a single-channel DNN-based approach and a multichannel nonnegative matrix factorization (NMF) based iterative EM framework.
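For orientation, the model at the heart of this framework can be sketched as follows. This is the standard formulation for multichannel Gaussian approaches of this kind; the symbols c_j(f,n) (spatial image of source j at frequency f and frame n), v_j(f,n) (source power spectrum, modeled by the DNNs), R_j(f) (spatial covariance matrix), and x(f,n) (mixture) are introduced here for illustration and may differ from the chapter's own notation. Each source's spatial image in the short-time Fourier transform domain is modeled as a zero-mean complex Gaussian whose covariance factors into a spectral and a spatial part:

\[
\mathbf{c}_j(f,n) \sim \mathcal{N}_c\bigl(\mathbf{0},\, v_j(f,n)\,\mathbf{R}_j(f)\bigr),
\qquad
\mathbf{x}(f,n) = \sum_j \mathbf{c}_j(f,n).
\]

Given parameter estimates from the EM iterations, each source is then recovered by the multichannel Wiener filter

\[
\hat{\mathbf{c}}_j(f,n) = v_j(f,n)\,\mathbf{R}_j(f)
\Bigl[\sum_{j'} v_{j'}(f,n)\,\mathbf{R}_{j'}(f)\Bigr]^{-1}\mathbf{x}(f,n).
\]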

Citation (APA)

Nugraha, A. A., Liutkus, A., & Vincent, E. (2018). Deep neural network based multichannel audio source separation. In Signals and Communication Technology (pp. 157–185). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-319-73031-8_7
