This paper investigates the contribution of formants and of prosodic features such as pitch and energy to the performance of automatic speech recognition systems in mobile networks, specifically with the GSM EFR (Global System for Mobile Enhanced Full Rate) codec. The front-end of the speech recognition system combines features extracted by converting the quantized spectral information of the speech coder with prosodic information and formant frequencies. The quantized spectral information is represented by the LPC (Linear Predictive Coding) coefficients, the LSF (Line Spectral Frequencies) coefficients, and two approximations of the LPC Cepstral Coefficients (LPCCs) computed from the LSFs: the Pseudo Cepstral Coefficients (PCC) and the Pseudo-Cepstrum (PCEP) coefficients. The resulting speaker-independent speech recognition system is based on a Continuous Hidden Markov Model (CHMM) classifier. The results show that the combined multivariate feature vectors significantly improve speech recognition performance in the mobile environment compared with a system that uses the speech coder bit-stream features alone.
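As an illustration of how cepstrum-like features can be obtained from LSFs without first reconstructing the LPC polynomial, the sketch below uses the pseudo-cepstrum approximation commonly given in the bit-stream recognition literature: c_n = (1 + (-1)^n) / (2n) + (1/n) Σ cos(n ω_i), where ω_i are the LSFs in radians. It is assumed here, not confirmed by the abstract, that PCC corresponds to this full expression and PCEP to the same expression without the first, signal-independent term. The function name and parameters are hypothetical.

```python
import math


def pseudo_cepstrum(lsf, n_coeffs, include_bias=True):
    """Approximate LPC cepstral coefficients directly from LSFs.

    lsf          -- line spectral frequencies in radians, each in (0, pi)
    n_coeffs     -- number of coefficients c_1 .. c_n_coeffs to return
    include_bias -- True: keep the (1 + (-1)^n) / (2n) term (assumed PCC);
                    False: drop it (assumed PCEP)
    """
    coeffs = []
    for n in range(1, n_coeffs + 1):
        # Signal-dependent part: average of cos(n * w_i) scaled by 1/n.
        spectral = sum(math.cos(n * w) for w in lsf) / n
        # Signal-independent part: nonzero only for even n.
        bias = (1 + (-1) ** n) / (2 * n) if include_bias else 0.0
        coeffs.append(bias + spectral)
    return coeffs


# Sanity check (assumption: order p = 10, as in a 10th-order LP analysis).
# Uniformly spaced LSFs describe a flat spectrum, so the full expression
# should give coefficients that are numerically zero.
p = 10
flat_lsf = [i * math.pi / (p + 1) for i in range(1, p + 1)]
pcc_flat = pseudo_cepstrum(flat_lsf, 12, include_bias=True)
pcep_flat = pseudo_cepstrum(flat_lsf, 12, include_bias=False)
```

Dropping the bias term changes only the even-indexed coefficients by a constant that does not depend on the speech signal, which is why the two variants carry the same spectral information.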
Bouchakour, L., & Debyeche, M. (2014). Prosodic features and formant contribution for speech recognition system over mobile network. In Advances in Intelligent Systems and Computing (Vol. 239, pp. 131–140). Springer Verlag. https://doi.org/10.1007/978-3-319-01854-6_14