With the development of deep learning, speech recognition based on deep neural networks has improved continuously in recent years. However, speech recognition performance for minority languages still lags behind that for majority languages, whose data can be collected and transcribed relatively easily. We therefore seek an effective method for sharing data across languages to improve minority language speech recognition. We propose a speech attribute detector model under an end-to-end framework and use the detector to extract features for minority language speech recognition. To the best of our knowledge, this is the first end-to-end model for extracting distinctive features. We conducted experiments on Tibetan and Mandarin. The results show that using the Mandarin data yields significant improvements in Tibetan phoneme recognition.
CITATION STYLE
Fu, T., Gao, S., & Wu, X. (2018). Improving minority language speech recognition based on distinctive features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11266 LNCS, pp. 411–420). Springer Verlag. https://doi.org/10.1007/978-3-030-02698-1_36