Learning an adversarial network for speech enhancement under extremely low signal-to-noise ratio condition

Abstract

Speech enhancement under low signal-to-noise ratio (SNR) conditions is a challenging task. This paper formulates speech enhancement as a spectrogram mapping problem that converts the noisy speech spectrogram to the clean speech spectrogram. On this basis, we propose a robust speech enhancement approach based on deep adversarial learning for extremely low SNR conditions. The deep adversarial network is trained on a few paired spectrograms of noisy and clean speech, and several strategies are applied to optimize it: skip connections, PatchGAN, and spectral normalization. Our approach is evaluated under extremely low SNR conditions (the lowest SNR is −20 dB), and the results demonstrate that it significantly improves speech quality and substantially outperforms representative deep learning models, including DNN, SEGAN, and bidirectional LSTM using a phase-sensitive spectrum approximation cost function (PSA-BLSTM), in terms of Short-Time Objective Intelligibility (STOI) and Perceptual Evaluation of Speech Quality (PESQ).
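
To give a sense of how severe −20 dB is, the sketch below mixes a signal with noise at a chosen SNR using the standard definition SNR = 10·log10(P_signal / P_noise). It is only an illustration and is not taken from the paper: the helper names `mix_at_snr` and `measured_snr_db`, the 16 kHz sample rate, and the sine wave standing in for speech are all assumptions.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that `clean + scaled noise` has the requested SNR (dB).

    Hypothetical helper for illustration; not the paper's data pipeline.
    """
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # SNR_dB = 10 * log10(clean_power / scaled_noise_power)
    target_noise_power = clean_power / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_noise_power / noise_power)
    return clean + scale * noise, scale

def measured_snr_db(clean, noisy):
    """Recover the SNR from the clean reference and the noisy mixture."""
    residual = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))

# Assumed toy data: one second at 16 kHz, a sine wave standing in for speech.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = 0.1 * np.sin(2.0 * np.pi * 440.0 * t)
noise = rng.standard_normal(t.shape)

noisy, scale = mix_at_snr(clean, noise, snr_db=-20.0)
# Ratio of noise power to signal power in the mixture.
power_ratio = np.mean((scale * noise) ** 2) / np.mean(clean ** 2)
```

At −20 dB the noise carries 100 times the power of the signal, so `power_ratio` comes out at about 100.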

Cite

Su, X., Hao, X., Wang, Z., Liu, Y., Xu, H., Liu, T., … Feilong. (2019). Learning an adversarial network for speech enhancement under extremely low signal-to-noise ratio condition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11953 LNCS, pp. 86–97). Springer. https://doi.org/10.1007/978-3-030-36708-4_8
