Performance evaluation of bimodal Hindi speech recognition under adverse environment


Abstract

Designing a robust Human-Computer Interaction (HCI) system is a challenging task, especially for automatic speech recognition (ASR) operating in an unfriendly environment. This paper proposes an ASR system that uses bimodal information (i.e., speech along with visual input), resulting in improved robustness. In this research, static and dynamic (Δ) audio features are extracted using Mel-Frequency Cepstral Coefficients (MFCC). The visual features are extracted using the Two-Dimensional Discrete Wavelet Transform (2D-DWT). Audio-visual recognition is performed over different combinations of visual features using a Hidden Markov Model (HMM) under clean and noisy environmental conditions. The Aligarh Muslim University Audio Visual (AMUAV) Hindi database was chosen as the baseline data. In addition, performance on noisy speech is evaluated at different signal-to-noise ratios (SNR: 30 dB to -20 dB). Finally, adding visual information to ASR is reported to increase accuracy in smart assistive environments, i.e., applications that may not have noise-free background conditions.
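
The abstract does not say how the noisy speech was generated, so the following is only a minimal sketch of one common way to produce test signals at a chosen SNR. It assumes additive white Gaussian noise, and the function names `add_noise_at_snr` and `measured_snr_db` are hypothetical and not taken from the paper.

```python
import numpy as np


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to `snr_db` (in dB).

    Assumption: the noise is white and Gaussian. The paper's abstract does
    not state which noise type was used.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)

    signal_power = np.mean(signal ** 2)
    noise = rng.standard_normal(signal.shape)
    noise_power = np.mean(noise ** 2)

    # SNR(dB) = 10 * log10(P_signal / P_noise), so the noise power we need is
    # P_signal / 10^(SNR/10). Rescale the drawn noise to hit that power.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise *= np.sqrt(target_noise_power / noise_power)
    return signal + noise


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB of `noisy` relative to the known `clean` signal."""
    clean = np.asarray(clean, dtype=float)
    residual = np.asarray(noisy, dtype=float) - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0           # one second at 16 kHz (assumed rate)
    clean = np.sin(2 * np.pi * 440.0 * t)    # stand-in for a speech waveform
    for snr in (30, 10, 0, -10, -20):        # endpoints match the abstract's range
        noisy = add_noise_at_snr(clean, snr, rng)
        print(snr, round(measured_snr_db(clean, noisy), 3))
```

At -20 dB the noise power is 100 times the signal power, which is the kind of condition where the abstract reports that visual information helps.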

Cite

Upadhyaya, P., Farooq, O., Abidi, M. R., & Varshney, P. (2015). Performance evaluation of bimodal Hindi speech recognition under adverse environment. In Advances in Intelligent Systems and Computing (Vol. 328, pp. 347–355). Springer Verlag. https://doi.org/10.1007/978-3-319-12012-6_38
