The impact of inaccurate phonetic annotations on speech recognition performance


Abstract

This paper focuses on the impact of phonetic inaccuracies in acoustic training data on the performance of an automatic speech recognition (ASR) system. This is especially important when the training data is created in an automated way, since such data often contains errors in the form of wrong phonetic transcriptions. A series of experiments simulating various common errors in phonetic transcriptions is conducted on parts of the GlobalPhone data set (for Croatian, Czech and Russian). These experiments show the influence of various errors on different languages and acoustic models (Gaussian mixture models, deep neural networks). The impact of errors is also shown for real data obtained by our automated ASR creation process for Belarusian. The results show that the best performance is achieved by using the most accurate data; however, a certain amount of errors (up to 5%) has a relatively small impact on speech recognition accuracy.

Cite

Safarik, R., & Mateju, L. (2017). The impact of inaccurate phonetic annotations on speech recognition performance. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10415 LNAI, pp. 402–410). Springer Verlag. https://doi.org/10.1007/978-3-319-64206-2_45
