The impact of inaccurate phonetic annotations on speech recognition performance

Radek Safarik; Lukas Mateju

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10415 LNAI 402-410

DOI: 10.1007/978-3-319-64206-2_45

Abstract

This paper focuses on the impact of phonetic inaccuracies in acoustic training data on the performance of an automatic speech recognition (ASR) system. This is especially important when the training data is created in an automated way, because such data often contains errors in the form of wrong phonetic transcriptions. A series of experiments simulating various common errors in phonetic transcriptions, based on parts of the GlobalPhone data set (for Croatian, Czech and Russian), is conducted. These experiments show the influence of various errors on different languages and acoustic models (Gaussian mixture models, deep neural networks). The impact of errors is also shown for real data obtained by our automated ASR creation process for Belarusian. The results show that the best performance is achieved by using the most accurate data; however, a certain amount of errors (up to 5%) has a relatively small impact on speech recognition accuracy.
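The abstract does not specify how the transcription errors were simulated. As a rough illustration only, the Python sketch below corrupts a phone sequence with random substitutions at a chosen rate (5% here, matching the threshold the abstract mentions). The function names, the toy phone inventory, and the uniform substitution model are all assumptions made for this example; they are not the authors' procedure.

```python
import random


def corrupt_phones(phones, error_rate, inventory, seed=0):
    """Return a copy of `phones` where each symbol is substituted with
    probability `error_rate` (hypothetical uniform substitution model)."""
    rng = random.Random(seed)
    corrupted = []
    for phone in phones:
        if rng.random() < error_rate:
            # Replace with a different phone drawn uniformly from the inventory.
            alternatives = [p for p in inventory if p != phone]
            corrupted.append(rng.choice(alternatives))
        else:
            corrupted.append(phone)
    return corrupted


def phone_error_rate(reference, hypothesis):
    """Fraction of positions that differ; valid because substitutions
    keep both sequences the same length."""
    mismatches = sum(r != h for r, h in zip(reference, hypothesis))
    return mismatches / len(reference)


# Toy inventory and reference transcription (illustrative, not GlobalPhone data).
inventory = ["a", "e", "i", "o", "u", "k", "t", "s", "m", "n"]
rng = random.Random(1)
reference = [rng.choice(inventory) for _ in range(10000)]

noisy = corrupt_phones(reference, error_rate=0.05, inventory=inventory)
print(f"simulated phone error rate: {phone_error_rate(reference, noisy):.3f}")
```

A substitution-only model keeps the sequence length fixed, which makes the error rate easy to measure. Simulating insertions or deletions would require an alignment-based metric instead.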

Cite

Safarik, R., & Mateju, L. (2017). The impact of inaccurate phonetic annotations on speech recognition performance. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10415 LNAI, pp. 402–410). Springer Verlag. https://doi.org/10.1007/978-3-319-64206-2_45
