Teacher-student learning and post-processing for robust bilstm mask-based acoustic beamforming

Zhaoyi Liu; Qiuyuan Chen; Han Hu; Haoyu Tang; Y. X. Zou

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11955 LNCS 522-533

DOI: 10.1007/978-3-030-36718-3_44

Abstract

In real-world environments, automatic speech recognition (ASR) is severely degraded by reverberation and background noise. A well-known strategy for reducing such interference in multi-microphone scenarios is microphone-array acoustic beamforming. Recently, time-frequency (T-F) mask-based acoustic beamforming has received tremendous interest and has shown great benefits as a front-end for noise-robust ASR. However, conventional neural network (NN) based T-F mask estimators are trained only on parallel simulated speech corpora, which leads to poor performance on real test data because of the mismatch between simulated and real data. To make NN-based mask estimation (NN-mask) more robust to this data mismatch, this paper proposes a teacher-student (T-S) learning scheme based on bi-directional long short-term memory (BiLSTM) networks, termed BiLSTM-TS, which can use real data during the student network's training stage. Moreover, to further suppress noise in the beamformed signal, we explore three mask-based post-processing methods to find a better way to use the masks estimated by the NN. The proposed approach is evaluated as a front-end for ASR on the CHiME-3 dataset. Experimental results show that the proposed strategies significantly reduce the data mismatch problem, yielding a 4% relative word error rate (WER) reduction over conventional BiLSTM mask-based beamforming on the real-data test set.
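
To make the idea of T-F mask-based beamforming concrete, the following is a minimal NumPy sketch of how estimated speech and noise masks can drive a beamformer. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which beamformer is used, so an MVDR beamformer in the reference-microphone formulation is assumed here, and the function names (`masked_scm`, `mvdr_weights`, `beamform`), the array layout, and the diagonal-loading constant are all hypothetical choices. The masks themselves would come from a mask estimator such as the BiLSTM network; here they are simply inputs.

```python
import numpy as np


def masked_scm(stft, mask):
    """Mask-weighted spatial covariance matrix per frequency bin.

    stft: complex array, shape (F, T, M) = (frequency bins, frames, microphones)
    mask: real array with values in [0, 1], shape (F, T)
    returns: complex array, shape (F, M, M)
    """
    # Sum over frames of mask * y y^H, normalised by the total mask weight.
    scm = np.einsum("ft,ftm,ftn->fmn", mask, stft, stft.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-10)
    return scm / norm[:, None, None]


def mvdr_weights(scm_speech, scm_noise, ref_mic=0, diag_load=1e-6):
    """MVDR weights (assumed beamformer, reference-microphone form).

    w = (Phi_n^-1 Phi_s) u / trace(Phi_n^-1 Phi_s), with u selecting ref_mic.
    returns: complex array, shape (F, M)
    """
    n_mics = scm_noise.shape[-1]
    # Diagonal loading (assumed constant) keeps the noise covariance invertible.
    load = diag_load * np.trace(scm_noise, axis1=-2, axis2=-1).real / n_mics
    scm_noise = scm_noise + load[:, None, None] * np.eye(n_mics)
    numerator = np.linalg.solve(scm_noise, scm_speech)  # Phi_n^-1 Phi_s
    trace = np.trace(numerator, axis1=-2, axis2=-1)
    return numerator[..., ref_mic] / (trace[:, None] + 1e-10)


def beamform(stft, weights):
    """Apply w^H y in every T-F bin; returns an array of shape (F, T)."""
    return np.einsum("fm,ftm->ft", weights.conj(), stft)
```

In use, the speech mask and noise mask would each be passed to `masked_scm` together with the multichannel STFT, the two covariance estimates would be passed to `mvdr_weights`, and `beamform` would produce the single-channel enhanced STFT that feeds the ASR back-end. The quality of the result depends directly on the masks, which is why a mask estimator that generalizes poorly to real recordings hurts the whole front-end.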

Cite (APA)

Liu, Z., Chen, Q., Hu, H., Tang, H., & Zou, Y. X. (2019). Teacher-student learning and post-processing for robust bilstm mask-based acoustic beamforming. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11955 LNCS, pp. 522–533). Springer. https://doi.org/10.1007/978-3-030-36718-3_44
