HTK-based recognition of whispered speech

Abstract

This paper presents results on speaker-dependent recognition of isolated whispered words from the Whi-Spe database. Word recognition rate is calculated for all speakers, four train/test scenarios, and three values of the number of mixture components, with modeling of context-independent monophones, context-dependent triphones, and whole words. Mel Frequency Cepstral Coefficients were used as the feature vector. The isolated-word recognizer was implemented with HTK, a toolkit for building Hidden Markov Models. In the matched scenarios, the best results showed nearly equal recognition rates: 99.86% for normal speech and 99.90% for whispered speech. In the mismatched scenarios, the best recognition rate was 64.80% when training on part of the normally phonated speech and testing on whispered speech, and 74.88% in the opposite case, training on whispered speech and testing on normal speech.
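As a rough illustration of the evaluation described above, the following minimal Python sketch enumerates four train/test scenarios and computes a word recognition rate for isolated words. It is not the authors' HTK setup: the function names, the mode labels, and the reading of the four scenarios as every normal/whisper train/test pairing are assumptions made here for illustration.

```python
from itertools import product

# Assumed labels for the two speech modes named in the abstract.
SPEECH_MODES = ("normal", "whisper")


def train_test_scenarios():
    """Return (train_mode, test_mode, kind) for every pairing of modes.

    Assumption: the four scenarios are the 2 x 2 pairings of modes,
    "match" when train and test modes agree, "mismatch" otherwise.
    """
    return [
        (train, test, "match" if train == test else "mismatch")
        for train, test in product(SPEECH_MODES, repeat=2)
    ]


def word_recognition_rate(reference, recognized):
    """Percentage of isolated words recognized correctly.

    Each test utterance holds exactly one word, so the rate is simply
    the share of utterances whose recognized label equals the reference.
    """
    if len(reference) != len(recognized):
        raise ValueError("reference and recognized must have equal length")
    if not reference:
        raise ValueError("at least one test utterance is required")
    correct = sum(ref == hyp for ref, hyp in zip(reference, recognized))
    return 100.0 * correct / len(reference)
```

For example, `word_recognition_rate(["one", "two", "three", "four"], ["one", "two", "tree", "four"])` scores three of four words as correct. In an actual HTK experiment this figure would normally be read from the toolkit's own results analysis rather than recomputed by hand.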

Cite

Galić, J., Jovičić, S. T., Grozdić, Đ., & Marković, B. (2014). HTK-based recognition of whispered speech. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8773, pp. 251–258). Springer Verlag. https://doi.org/10.1007/978-3-319-11581-8_31
