Abstract
Whisper is a specific mode of speech characterized by turbulent airflow at the glottis. Although perceiving whisper requires more effort from the listener, its intelligibility in human communication is very high. An enormous acoustic mismatch between normally phonated (neutral) and whispered speech is the main reason why modern Automatic Speech Recognition (ASR) systems suffer a significant drop in performance when applied to whisper. In this paper, we present an analysis of whisper recognition using two machine-learning techniques: Hidden Markov Models (HMM) and Support Vector Machines (SVM). The experiments are conducted in both Speaker Dependent (SD) and Speaker Independent (SI) fashion on the Whi-Spe speech database. The best neutral-trained whisper recognition accuracy in SD fashion (83.36%) is obtained with the SVM framework. At the same time, HMM-based recognition gives the highest recognition accuracy in SI fashion (87.42%). Results for the recognition of neutral speech are given as well.
Cite
Galić, J., Popović, B., & Pavlović, D. Š. (2018). Whispered speech recognition using hidden Markov models and support vector machines. Acta Polytechnica Hungarica, 15(5), 11–29. https://doi.org/10.12700/APH.15.5.2018.5.2