Hybrid HMM/BLSTM-RNN for robust speech recognition

Yang Sun; Louis Ten Bosch; Lou Boves

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6231 LNAI 400-407

DOI: 10.1007/978-3-642-15760-8_51

Abstract

The question of how to integrate information from different sources in speech decoding (layered architecture versus integrated search) is still only partially solved. We investigate the optimal integration of information from Artificial Neural Nets into a speech decoding scheme based on a Dynamic Bayesian Network (DBN) for noise-robust ASR. An HMM implemented by the DBN cooperates with a novel Recurrent Neural Network (BLSTM-RNN), which exploits long-range context information to predict a phoneme for each MFCC frame. When the identity of the most likely phoneme is used as a direct observation, such a hybrid system has been shown to improve noise robustness. In this paper, we instead present the complete BLSTM-RNN output to the DBN as Virtual Evidence. This allows the hybrid system to use information about all phoneme candidates, which was not possible in previous experiments. Our approach improved word accuracy on the Aurora 2 corpus by 8%. © 2010 Springer-Verlag Berlin Heidelberg.
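The difference between passing only the most likely phoneme and passing the complete network output can be illustrated with a toy decoder. The following is a minimal sketch under stated assumptions, not the authors' DBN implementation: it uses a plain Viterbi search over a two-state model, treats each frame's posterior vector as a multiplicative per-state score (one common reading of virtual evidence), and models the hard observation as a floored one-hot vector. All names (`viterbi`, `soft_evidence`, `hard_evidence`), the transition probabilities, and the posterior values are hypothetical.

```python
import math

def viterbi(log_init, log_trans, log_evidence):
    """Return the best state path given per-frame, per-state log scores."""
    n_states = len(log_init)
    score = [log_init[s] + log_evidence[0][s] for s in range(n_states)]
    backptrs = []
    for frame in log_evidence[1:]:
        new_score, ptrs = [], []
        for s in range(n_states):
            # Best predecessor for state s at this frame.
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s] + frame[s])
            ptrs.append(best_prev)
        score = new_score
        backptrs.append(ptrs)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptrs in reversed(backptrs):
        state = ptrs[state]
        path.append(state)
    return path[::-1]

def soft_evidence(posteriors, floor=1e-6):
    """Assumed soft-evidence form: use the full posterior vector of each frame."""
    return [[math.log(max(p, floor)) for p in frame] for frame in posteriors]

def hard_evidence(posteriors, floor=1e-6):
    """Assumed hard-observation form: keep only the arg-max phoneme per frame."""
    result = []
    for frame in posteriors:
        best = max(range(len(frame)), key=lambda i: frame[i])
        result.append([math.log(1.0 if i == best else floor)
                       for i in range(len(frame))])
    return result

# Hypothetical toy model: two phoneme states with "sticky" transitions.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]

# Hypothetical network posteriors; the middle frame is nearly ambiguous.
posteriors = [[0.90, 0.10], [0.45, 0.55], [0.90, 0.10]]

soft_path = viterbi(log_init, log_trans, soft_evidence(posteriors))
hard_path = viterbi(log_init, log_trans, hard_evidence(posteriors))
print(soft_path)  # [0, 0, 0]
print(hard_path)  # [0, 1, 0]
```

In this toy case the hard observation forces a state switch on the ambiguous middle frame, while the full posterior lets the transition model outweigh a weak preference. This is only meant to show why access to all phoneme candidates can matter; it says nothing about the accuracy figures reported for Aurora 2.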

Sun, Y., Ten Bosch, L., & Boves, L. (2010). Hybrid HMM/BLSTM-RNN for robust speech recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6231 LNAI, pp. 400–407). https://doi.org/10.1007/978-3-642-15760-8_51
