Automatic Speech Recognition (ASR) is an essential component of many human-computer interaction systems. A variety of ASR applications have reached high performance levels, but only under controlled acoustic conditions. In this project, we mitigate the effect of noise in video lectures using bi-modal feature extraction. Audio features must be enhanced with complementary sources of information to overcome the problems caused by large amounts of acoustic noise. Visual information extracted from the speaker's mouth region is a promising and appropriate source for boosting audio-only recognition. Lip/mouth detection and tracking, combined with traditional image-processing methods, offer a variety of solutions for constructing the visual front end. Furthermore, fusing the audio and visual streams is even more challenging and is crucial for designing an efficient audio-visual recognizer. We therefore investigate problems in the field of Audio-Visual Automatic Speech Recognition (AV-ASR) concerning visual feature extraction and audio-visual integration, with the goal of reducing the impact of noise in video lectures.