Pronunciation adaptation for disordered speech recognition using state-specific vectors of phone-cluster adaptive training


Abstract

Pronunciation variation is a major problem in disordered speech recognition. This paper focuses on handling the pronunciation variations in dysarthric speech by forming speaker-specific lexicons. A novel approach is proposed for identifying the mispronunciations made by each dysarthric speaker, using the state-specific vectors (SSVs) of a phone-cluster adaptive training (Phone-CAT) acoustic model. An SSV is a low-dimensional vector estimated for each tied state, in which each element denotes the weight of a particular monophone. The dominant weight of an SSV indicates the phone that was actually pronounced. This property is exploited to adapt the pronunciation model to a particular dysarthric speaker through a speaker-specific lexicon. Experimental validation on the Nemours database showed an average relative improvement of 9% across all speakers compared to the system built with the canonical lexicon.
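The dominant-weight idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the phone set, the weight values, and the helper names `dominant_phone` and `adapt_pronunciation` are hypothetical, and it assumes, for simplicity, one SSV per canonical phone rather than one per tied state.

```python
# Toy illustration only: the phone set, weights, and function names are
# hypothetical and are not taken from the paper.
PHONES = ["b", "p", "s", "t"]


def dominant_phone(ssv, phones=PHONES):
    """Return the monophone whose weight is largest in the SSV."""
    index = max(range(len(ssv)), key=lambda i: ssv[i])
    return phones[index]


def adapt_pronunciation(canonical_phones, ssv_by_phone):
    """Build a speaker-specific pronunciation from a canonical one.

    Assumption: one SSV per canonical phone (the paper estimates one per
    tied state). Phones without an SSV are left unchanged.
    """
    adapted = []
    for phone in canonical_phones:
        ssv = ssv_by_phone.get(phone)
        adapted.append(dominant_phone(ssv) if ssv is not None else phone)
    return adapted


# Made-up weights: most of the mass for canonical "b" sits on "p",
# which would suggest this speaker realizes "b" as "p".
toy_ssvs = {
    "b": [0.2, 0.7, 0.05, 0.05],
    "s": [0.0, 0.1, 0.8, 0.1],
}

print(adapt_pronunciation(["b", "s", "t"], toy_ssvs))
```

In this toy setup, a canonical entry whose phones are `b s t` would receive the speaker-specific variant `p s t`, because the dominant weight for `b` falls on `p` while `s` keeps its own label.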

Cite

Sriranjani, R., Umesh, S., & Reddy, M. R. (2015). Pronunciation adaptation for disordered speech recognition using state-specific vectors of phone-cluster adaptive training. In SLPAT 2015 - 6th Workshop on Speech and Language Processing for Assistive Technologies, Proceedings (pp. 72–78). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-5113
