Performance evaluation of bimodal Hindi speech recognition under adverse environment

Prashant Upadhyaya; Omar Farooq; M. R. Abidi; Priyanka Varshney

Conference Proceedings

Advances in Intelligent Systems and Computing, vol. 328 (2015), pp. 347-355

DOI: 10.1007/978-3-319-12012-6_38

Abstract

Designing a robust Human-Computer Interaction (HCI) system is a challenging task, especially for automatic speech recognition (ASR) operating in adverse environments. This paper proposes an ASR system that uses bimodal information (i.e., speech along with visual input), resulting in improved robustness. In this research, static and dynamic (Δ) audio features are extracted using Mel-Frequency Cepstral Coefficients (MFCC). The visual features are extracted using the Two-Dimensional Discrete Wavelet Transform (2D-DWT). Audio-visual recognition is performed over different combinations of visual features using a Hidden Markov Model (HMM) under clean and noisy environmental conditions. The Aligarh Muslim University Audio Visual (AMUAV) Hindi database is used as the baseline data. In addition, recognition performance on noisy speech is evaluated at different signal-to-noise ratios (SNR: 30 dB to -20 dB). Finally, the addition of visual information to ASR is reported to increase accuracy in smart assistive environments, i.e., applications that may not have noise-free background conditions.
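
The abstract gives the evaluated SNR range but not how the noisy speech was produced. As a rough illustration only, the sketch below shows one common way to mix noise into a clean signal at a target SNR. The function names, the sine-wave stand-in for speech, the white Gaussian noise, and the 10 dB step are all assumptions made for this example, not details taken from the paper.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # SNR_dB = 10 * log10(p_speech / (gain**2 * p_noise)); solve for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of `noisy` relative to `clean`, from the residual noisy - clean."""
    residual = [y - x for x, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))


# Stand-in signals (assumptions): a 440 Hz tone sampled at 16 kHz and white noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# The abstract states the range 30 dB to -20 dB; the 10 dB step is assumed here.
for snr_db in (30, 20, 10, 0, -10, -20):
    noisy = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

Scaling the noise rather than the speech keeps the clean signal level fixed across conditions, so differences in recognition accuracy can be attributed to the noise level alone.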

Cite

Upadhyaya, P., Farooq, O., Abidi, M. R., & Varshney, P. (2015). Performance evaluation of bimodal Hindi speech recognition under adverse environment. In Advances in Intelligent Systems and Computing (Vol. 328, pp. 347–355). Springer Verlag. https://doi.org/10.1007/978-3-319-12012-6_38
