A filled pause is a type of disfluency that occurs frequently in spontaneous speech and is known to reduce Automatic Speech Recognition accuracy. This study analyzes the impact of boosting Mel-Frequency Cepstral Coefficients with an energy feature when classifying filled pauses. A total of 828 filled pauses from 62 male and female speakers are classified into /mhm/, /aaa/ and /eer/. A back-propagation neural network trained with a combination of gradient descent with momentum and an adaptive learning rate is used as the classifier. The results show that energy-boosted Mel-Frequency Cepstral Coefficients produced a higher accuracy rate of 77% in classifying filled pauses. © 2014 Springer Science+Business Media Singapore.
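The abstract names the training scheme (gradient descent with momentum plus an adaptive learning rate) but does not give its update rule. The following is a minimal sketch of one common heuristic for that combination, not the authors' implementation. The function name `train_gdm_adaptive`, the accept/reject rule, and all constants (`momentum`, `lr_inc`, `lr_dec`, `max_loss_ratio`) are illustrative assumptions, and a toy quadratic loss stands in for the network's error function.

```python
import numpy as np

def train_gdm_adaptive(loss_and_grad, w0, lr=0.01, momentum=0.9,
                       lr_inc=1.05, lr_dec=0.7, max_loss_ratio=1.04,
                       epochs=500):
    """Gradient descent with momentum and an adaptive learning rate.

    All default constants are illustrative assumptions, not values
    reported in the paper. `loss_and_grad(w)` must return (loss, grad).
    """
    w = np.asarray(w0, dtype=float).copy()
    velocity = np.zeros_like(w)
    loss, grad = loss_and_grad(w)
    for _ in range(epochs):
        # Momentum update: blend the previous step with the new gradient step.
        new_velocity = momentum * velocity - lr * grad
        candidate = w + new_velocity
        new_loss, new_grad = loss_and_grad(candidate)
        if new_loss > max_loss_ratio * loss:
            # Loss grew too much: reject the step, shrink the learning
            # rate, and reset the momentum term.
            lr *= lr_dec
            velocity = np.zeros_like(w)
            continue
        if new_loss < loss:
            # Loss improved: cautiously grow the learning rate.
            lr *= lr_inc
        w, velocity, loss, grad = candidate, new_velocity, new_loss, new_grad
    return w, loss, lr

# Toy stand-in for a network error surface: 0.5 * ||w - target||^2.
target = np.array([1.0, -2.0, 0.5])

def quadratic(w):
    diff = w - target
    return 0.5 * float(diff @ diff), diff

weights, final_loss, final_lr = train_gdm_adaptive(quadratic, np.zeros(3))
print(weights, final_loss, final_lr)
```

In a real classifier, `loss_and_grad` would compute the network error and its back-propagated gradient over the feature vectors; the accept/reject logic around the learning rate would stay the same.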
CITATION STYLE
Hamzah, R., Jamil, N., & Seman, N. (2014). Filled pause classification using energy-boosted mel-frequency cepstrum coefficients. In Lecture Notes in Electrical Engineering (Vol. 291 LNEE, pp. 311–320). Springer Verlag. https://doi.org/10.1007/978-981-4585-42-2_36