Filled pause classification using energy-boosted mel-frequency cepstrum coefficients

Abstract

A filled pause is a type of disfluency that occurs frequently in spontaneous speech and is known to affect Automatic Speech Recognition accuracy. The purpose of this study is to analyze the impact of boosting Mel-Frequency Cepstral Coefficients with an energy feature when classifying filled pauses. A total of 828 filled pauses, drawn from a mixture of 62 male and female speakers, are classified into /mhm/, /aaa/ and /eer/. A back-propagation neural network trained with a fusion of gradient descent with momentum and an adaptive learning rate is used as the classifier. The results revealed that energy-boosted Mel-Frequency Cepstral Coefficients produced a higher accuracy rate of 77% in classifying filled pauses. © 2014 Springer Science+Business Media Singapore.
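
The abstract does not say how the energy feature is computed or combined with the cepstral coefficients, nor how the learning rate is adapted. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the "boost" means appending per-frame log energy as one extra column to an already computed MFCC matrix, and it assumes a simple adaptive rule (grow the learning rate when the loss falls, shrink it when the loss rises) alongside a standard momentum step. All function names, frame sizes, and constants here are hypothetical.

```python
import numpy as np

def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def log_energy(frames, eps=1e-10):
    """Log of the summed squared amplitude of each frame."""
    return np.log(np.sum(frames ** 2, axis=1) + eps)

def boost_with_energy(mfcc, frames):
    """Append per-frame log energy as an extra column (assumed form of the boost)."""
    if mfcc.shape[0] != frames.shape[0]:
        raise ValueError("mfcc and frames must have the same number of frames")
    return np.hstack([mfcc, log_energy(frames)[:, None]])

def gd_momentum_adaptive_step(w, grad, velocity, lr, loss, prev_loss,
                              momentum=0.9, lr_inc=1.05, lr_dec=0.7):
    """One weight update: adapt lr from the loss trend, then take a momentum step.

    The increase/decrease factors are illustrative assumptions.
    """
    lr = lr * lr_inc if loss < prev_loss else lr * lr_dec
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity, lr

# Usage with synthetic data standing in for real audio and real MFCCs.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)                 # stand-in for 1 s at 16 kHz
frames = frame_signal(signal, frame_len=400, hop=160)
mfcc = rng.standard_normal((frames.shape[0], 12))   # placeholder MFCC matrix
features = boost_with_energy(mfcc, frames)
print(features.shape)
```

In this sketch each frame's feature vector grows from 12 to 13 values, so a classifier sees loudness information alongside spectral shape. Whether the paper appends energy this way or uses a different form is not stated in the abstract.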

Citation (APA)

Hamzah, R., Jamil, N., & Seman, N. (2014). Filled pause classification using energy-boosted mel-frequency cepstrum coefficients. In Lecture Notes in Electrical Engineering (Vol. 291 LNEE, pp. 311–320). Springer Verlag. https://doi.org/10.1007/978-981-4585-42-2_36
