Improving Russian LVCSR Using Deep Neural Networks for Acoustic and Language Modeling

Irina Kipyatkova

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11096 LNAI 291-300

DOI: 10.1007/978-3-319-99579-3_31

6 Citations

2 Readers

Abstract

In this paper, we present our very large vocabulary continuous speech recognition system for Russian, which is based on several types of neural networks. We employed neural networks at both the acoustic and the language modeling stages. For training hybrid acoustic models, we experimented with four types of neural networks: a feedforward deep neural network, a time-delay neural network, a Long Short-Term Memory network, and a bidirectional Long Short-Term Memory network. We created networks with various numbers of hidden layers and of units per hidden layer. Language modeling was performed using a recurrent neural network. Experiments on Russian speech recognition were first carried out using the hybrid acoustic models and a 3-gram language model; the 500-best list was then rescored with the recurrent neural network language model. The lowest word error rate, 15.13%, was achieved using a time-delay neural network for acoustic modeling and a recurrent neural network language model interpolated with the 3-gram model for 500-best list rescoring.
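The rescoring step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes per-word linear interpolation of the two language model probabilities, and the names `interpolate_logprob`, `rescore_nbest`, `lam`, and `lm_weight`, as well as the hypothesis fields, are hypothetical. The abstract does not state the interpolation weight or the language model scale, so the default values below are placeholders.

```python
import math


def interpolate_logprob(logp_rnn, logp_ngram, lam=0.5):
    """Return log(lam * p_rnn + (1 - lam) * p_ngram) from natural-log inputs.

    lam is an assumed interpolation weight with 0 < lam < 1.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    a = math.log(lam) + logp_rnn
    b = math.log(1.0 - lam) + logp_ngram
    m = max(a, b)  # log-sum-exp trick to avoid underflow
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def rescore_nbest(hypotheses, lam=0.5, lm_weight=1.0):
    """Pick the best hypothesis from an N-best list after LM rescoring.

    Each hypothesis is a dict (hypothetical layout) with:
      'words'          : list of word strings
      'am_score'       : acoustic log-score from the first pass
      'rnn_logprobs'   : per-word log-probabilities from the RNN LM
      'ngram_logprobs' : per-word log-probabilities from the 3-gram LM
    """
    best_hyp, best_score = None, -math.inf
    for hyp in hypotheses:
        lm_score = sum(
            interpolate_logprob(r, n, lam)
            for r, n in zip(hyp["rnn_logprobs"], hyp["ngram_logprobs"])
        )
        total = hyp["am_score"] + lm_weight * lm_score
        if total > best_score:
            best_hyp, best_score = hyp, total
    return best_hyp, best_score
```

In a real system the N-best list (500 entries in the paper) and the per-word scores would come from the decoder and the trained language models. Here they are supplied by the caller, so the sketch shows only how the interpolated score reorders the list.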

Cite

Kipyatkova, I. (2018). Improving Russian LVCSR Using Deep Neural Networks for Acoustic and Language Modeling. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11096 LNAI, pp. 291–300). Springer Verlag. https://doi.org/10.1007/978-3-319-99579-3_31
