This paper introduces a new approach to dictionary-based source separation employing a learned non-linear metric. In contrast to existing parametric source separation systems, this model can draw on a rich dictionary of speech signals. In contrast to previous dictionary-based source separation systems, it can exploit perceptually relevant non-linear features of the noisy and clean audio. The approach uses a deep neural network (DNN) to predict whether a noisy chunk of audio contains a given clean chunk. Speaker-dependent experiments on the small-vocabulary CHiME2-GRID corpus show that this model can accurately resynthesize clean speech from noisy observations. Preliminary listening tests show that the system's output has much higher audio quality than existing parametric systems trained on the same data, achieving noise suppression levels close to those of the original clean speech.
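The concatenative resynthesis described above can be sketched as best-dictionary-chunk selection under a learned similarity. In this minimal sketch, a toy correlation score stands in for the paper's DNN, and all function and variable names are illustrative assumptions, not the authors' code:

```python
import numpy as np

def resynthesize(noisy_chunks, dictionary, score):
    """Replace each noisy chunk with the best-matching clean dictionary chunk.

    `score(noisy, clean)` plays the role of the paper's DNN, which predicts
    whether a noisy chunk contains a given clean chunk; here it is a
    hypothetical stand-in, not the learned metric itself.
    """
    selected = []
    for noisy in noisy_chunks:
        # Score the noisy chunk against every clean chunk in the dictionary
        # and keep the clean chunk judged most likely to be present.
        scores = [score(noisy, clean) for clean in dictionary]
        selected.append(dictionary[int(np.argmax(scores))])
    # Concatenating the selected clean chunks yields the resynthesized signal.
    return np.concatenate(selected)

def toy_score(noisy, clean):
    # Placeholder similarity (inner product) in place of the trained DNN.
    return float(np.dot(noisy, clean))
```

Because the output is assembled entirely from clean dictionary chunks, the resynthesized signal contains no residual noise, which is what allows the system to approach the quality of the original clean speech.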
Mandel, M. I., Cho, Y. S., & Wang, Y. (2014). Learning a concatenative resynthesis system for noise suppression. In 2014 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2014 (pp. 582–586). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/GlobalSIP.2014.7032184