Learning an adversarial network for speech enhancement under extremely low signal-to-noise ratio condition

Xiangdong Su; Xiang Hao; Zhiyu Wang; Yun Liu; Huali Xu; Tongyang Liu; Guanglai Gao; Feilong

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019), Vol. 11953 LNCS, pp. 86–97

DOI: 10.1007/978-3-030-36708-4_8

Citations: 1

Readers: 5

Abstract

Speech enhancement under low signal-to-noise ratio (SNR) conditions is a challenging task. This paper formulates speech enhancement as a spectrogram mapping problem that converts the noisy speech spectrogram to the clean speech spectrogram. On this basis, we propose a robust speech enhancement approach based on deep adversarial learning for extremely low SNR conditions. The deep adversarial network is trained on a few paired spectrograms of noisy and clean speech, and several strategies are applied to optimize it: skip connections, PatchGAN, and spectral normalization. Our approach is evaluated under extremely low SNR conditions (the lowest SNR is −20 dB), and the results demonstrate that it significantly improves speech quality and substantially outperforms representative deep learning models, including DNN, SEGAN, and a bidirectional LSTM using the phase-sensitive spectrum approximation cost function (PSA-BLSTM), in terms of Short-Time Objective Intelligibility (STOI) and Perceptual Evaluation of Speech Quality (PESQ).
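To give a sense of scale, an SNR of −20 dB means the noise carries 100 times the power of the speech. The following is a minimal sketch, not taken from the paper, of how a noisy example at a chosen SNR could be constructed from a clean signal and a noise signal. The function names (`power`, `mix_at_snr`, `snr_db`) and the synthetic sine and Gaussian signals are illustrative assumptions, not the authors' data pipeline.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in x) / len(x)


def snr_db(signal, noise):
    """SNR in decibels: 10 * log10(P_signal / P_noise)."""
    return 10 * math.log10(power(signal) / power(noise))


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals the target,
    then add it to `clean`. Returns (noisy, scaled_noise)."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    p_clean, p_noise = power(clean), power(noise)
    if p_clean == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # Solve 10*log10(p_clean / (g^2 * p_noise)) = target for the gain g.
    gain = math.sqrt(p_clean / (p_noise * 10 ** (target_snr_db / 10)))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


# Hypothetical stand-ins: a 440 Hz tone for "speech", white Gaussian noise.
rng = random.Random(0)
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

noisy, scaled_noise = mix_at_snr(clean, noise, -20.0)
```

In a spectrogram-mapping setup such as the one the abstract describes, `noisy` and `clean` would each be converted to a spectrogram to form one training pair; that step is omitted here.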

Cite

Su, X., Hao, X., Wang, Z., Liu, Y., Xu, H., Liu, T., … Feilong. (2019). Learning an adversarial network for speech enhancement under extremely low signal-to-noise ratio condition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11953 LNCS, pp. 86–97). Springer. https://doi.org/10.1007/978-3-030-36708-4_8
