We study class n-gram models for very large vocabulary speech recognition of Finnish and Estonian. The models are trained with vocabulary sizes of several million words using automatically derived classes. To evaluate the models on Finnish and Estonian broadcast news speech recognition tasks, we modify Aalto University’s LVCSR decoder to operate with class n-grams and very large vocabularies. Linear interpolation of a standard n-gram model and a class n-gram model provides relative perplexity improvements of 21.3% for Finnish and 12.8% for Estonian over the n-gram model alone. The relative improvements in word error rates are 5.5% for Finnish and 7.4% for Estonian. We also compare our word-based models to a state-of-the-art unlimited vocabulary recognizer utilizing subword n-gram models, and show that the very large vocabulary word-based models can perform equally well or better.
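The two ingredients named in the abstract — a class n-gram factorization and linear interpolation with a word n-gram — can be sketched as follows. This is an illustrative sketch only, not the authors' implementation; the function names and the interpolation weight are hypothetical, and the probabilities would in practice come from trained language models.

```python
import math

def class_ngram_prob(p_class_given_hist, p_word_given_class):
    # Class n-gram factorization: P(w | h) is approximated as
    # P(class(w) | class history) * P(w | class(w)).
    return p_class_given_hist * p_word_given_class

def interpolate(p_word_model, p_class_model, lam=0.5):
    # Linear interpolation of the word n-gram and class n-gram
    # probabilities; lam is a tuned weight (value here is hypothetical).
    return lam * p_word_model + (1.0 - lam) * p_class_model

def perplexity(word_probs):
    # Perplexity of a held-out sequence from its per-word probabilities;
    # lower perplexity indicates a better language model.
    n = len(word_probs)
    return math.exp(-sum(math.log(p) for p in word_probs) / n)
```

A lower perplexity for the interpolated probabilities than for either component model, as reported above, indicates that the word-based and class-based models capture complementary information.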
Varjokallio, M., Kurimo, M., & Virpioja, S. (2016). Class n-gram models for very large vocabulary speech recognition of Finnish and Estonian. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9918 LNCS, pp. 133–144). Springer Verlag. https://doi.org/10.1007/978-3-319-45925-7_11