Automatic Correction of i/y Spelling in Czech ASR Output

Abstract

This paper presents the design and evaluation of a method that automatically corrects the spelling of i/y in Czech words at the output of an ASR decoder. After analyzing both the Czech grammar rules and the data, we decided to deal only with endings consisting of one of the consonants b/f/l/m/p/s/v/z followed by i/y, in both short and long forms. The correction is framed as a classification task in which each word belongs to the “i” class, the “y” class, or the “empty” class. Using the state-of-the-art Bidirectional Encoder Representations from Transformers (BERT) architecture, we were able to substantially improve the correctness of the i/y spelling on both simulated and real ASR output. Since the majority of native Czech speakers see a misspelled i/y in Czech text as a blatant error, the corrected output greatly improves the perceived quality of the ASR system.
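To make the three-class framing concrete, the following minimal Python sketch assigns each word a label from its ending. It is an illustration only, not the authors' implementation: the regular expression, the function names `label_word` and `label_sentence`, and the reading of "ending" as the final two characters of a word are all assumptions, as is the reading that the “empty” class covers words without such an ending. A BERT-based classifier as described in the abstract would predict these labels from context rather than read them off the spelling.

```python
import re

# Consonants listed in the abstract, followed by short or long i/y at the
# end of the word. The pattern itself is an illustrative assumption.
ENDING = re.compile(r"[bflmpsvz]([iyíý])$", re.IGNORECASE)


def label_word(word: str) -> str:
    """Return "i", "y", or "empty" for a single word (hypothetical helper)."""
    match = ENDING.search(word)
    if match is None:
        # Assumed meaning of the "empty" class: no targeted ending present.
        return "empty"
    return "i" if match.group(1).lower() in "ií" else "y"


def label_sentence(sentence: str) -> list[tuple[str, str]]:
    """Label every whitespace-separated word of an ASR hypothesis."""
    return [(word, label_word(word)) for word in sentence.split()]
```

Under these assumptions, `label_word("byli")` gives `"i"`, `label_word("byly")` gives `"y"`, and `label_word("ženy")` gives `"empty"` because n is not among the listed consonants.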

Cite

Švec, J., Lehečka, J., Šmídl, L., & Ircing, P. (2020). Automatic correction of i/y spelling in Czech ASR output. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12284 LNAI, pp. 321–330). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58323-1_35
