In this paper we present the latest improvements to the Russian spontaneous speech recognition system developed at the Speech Technology Center (STC). A significant word error rate (WER) reduction was obtained by rescoring hypotheses with two sophisticated language models: a Recurrent Neural Network Language Model and a regularized Long Short-Term Memory Language Model. For acoustic modeling we used a deep neural network (DNN) trained on speaker-dependent bottleneck features, as in our previous system. This DNN was combined with a deep Bidirectional Long Short-Term Memory acoustic model by means of score fusion. The resulting system achieves a WER of 16.4%, an absolute reduction of 8.7% and a relative reduction of 34.7% compared with our previous system's result on this test set.
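The reported absolute and relative reductions can be checked against each other with a short calculation. The sketch below is only an illustration: the function names are hypothetical, and the previous system's WER of 25.1% is not stated in the abstract but is implied by adding the absolute reduction to the current WER.

```python
def previous_wer(current_wer: float, absolute_reduction: float) -> float:
    """Recover the earlier WER (in percent) implied by the reported figures."""
    return current_wer + absolute_reduction


def relative_reduction(previous: float, current: float) -> float:
    """Relative WER reduction in percent, measured against the previous WER."""
    return 100.0 * (previous - current) / previous


# Figures taken from the abstract (percent).
current = 16.4
absolute = 8.7

# Implied value, not quoted in the abstract: 16.4 + 8.7 = 25.1.
previous = previous_wer(current, absolute)
relative = relative_reduction(previous, current)

print(f"implied previous WER: {previous:.1f}%")
print(f"relative reduction:   {relative:.1f}%")
```

Under these assumptions the relative reduction rounds to the 34.7% quoted above, so the three reported numbers are mutually consistent.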
Medennikov, I., & Prudnikov, A. (2016). Advances in STC Russian spontaneous speech recognition system. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9811 LNCS, pp. 116–123). Springer Verlag. https://doi.org/10.1007/978-3-319-43958-7_13