Tibetan multi-dialect speech and dialect identity recognition

Abstract

Tibetan has so far had very limited resources for conventional automatic speech recognition. Some dialects lack sufficient data, sub-word units, lexicons, and word inventories. Most prior work has treated speech content recognition and dialect classification as two independent tasks and modeled them separately, although the two tasks are highly correlated. In this paper, we present a multi-task WaveNet model that performs Tibetan multi-dialect speech recognition and dialect identification simultaneously. It avoids building a pronunciation dictionary and performing word segmentation for new dialects, and it allows speech recognition and dialect identification to be trained in a single model. The experimental results show that our method can recognize speech content for different Tibetan dialects and identify the dialect with high accuracy using a unified model. Including dialect information in the output during training improves multi-dialect speech recognition accuracy, and the low-resource dialects achieve higher speech content recognition rates and dialect classification accuracy with the multi-dialect, multi-task model than with task-specific models.
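
As a rough illustration of what "dialect information used in output" could look like, the sketch below builds a single output vocabulary that holds both text symbols and dialect tags, so one sequence model can emit the transcript and the dialect label in the same output. This is not the authors' implementation: the tag names, the character-level units, the choice to append the tag at the end of the sequence, and the reserved id 0 are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical dialect tags; the names are placeholders, not from the paper.
DIALECT_TAGS = ["<dialect_a>", "<dialect_b>", "<dialect_c>"]


def build_vocab(transcripts: List[str]) -> Dict[str, int]:
    """Map every character and every dialect tag to an integer id.

    Id 0 is left unused here (assumed reserved for a blank or padding symbol).
    """
    symbols = sorted({ch for text in transcripts for ch in text})
    return {sym: i + 1 for i, sym in enumerate(symbols + DIALECT_TAGS)}


def encode(text: str, dialect: str, vocab: Dict[str, int]) -> List[int]:
    """Encode the characters, then append the dialect tag as the last symbol."""
    return [vocab[ch] for ch in text] + [vocab[dialect]]


def decode(ids: List[int], vocab: Dict[str, int]) -> Tuple[str, str]:
    """Split a predicted id sequence into (text, dialect tag)."""
    inverse = {i: s for s, i in vocab.items()}
    symbols = [inverse[i] for i in ids]
    dialect = next((s for s in symbols if s in DIALECT_TAGS), "")
    text = "".join(s for s in symbols if s not in DIALECT_TAGS)
    return text, dialect
```

With a shared target like this, a single model is trained on one label sequence per utterance, and the dialect is read off from the tag at decoding time.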

Cite

Zhao, Y., Yue, J., Song, W., Xu, X., Li, X., Wu, L., & Ji, Q. (2019). Tibetan multi-dialect speech and dialect identity recognition. Computers, Materials and Continua, 60(3), 1223–1235. https://doi.org/10.32604/cmc.2019.05636