Fusion of audio-visual information for integrated speech processing


Abstract

This paper describes the integration of audio and visual speech information for robust adaptive speech processing. Since both audio speech signals and visual face configurations are produced by the human speech organs, the two types of information are strongly correlated and sometimes complement each other. Two applications based on this relationship are presented: bimodal speech recognition that integrates audio-visual information for robustness to acoustic noise, and speaking face synthesis based on the correlation between audio and visual speech. © Springer-Verlag 2001.
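The abstract does not specify how the two modalities are combined. A common scheme for bimodal speech recognition of this era is stream-weighted late fusion, in which per-class log-likelihoods from separately trained audio and visual models are combined with a tunable exponent weight. The sketch below is a minimal illustration under that assumption only; the function name fuse_stream_scores, the weight value, and the randomly generated scores are hypothetical stand-ins, not taken from the paper.

```python
import numpy as np

def fuse_stream_scores(audio_loglik, visual_loglik, audio_weight=0.7):
    """Stream-weighted late fusion (hypothetical sketch).

    audio_loglik, visual_loglik: arrays of shape (frames, classes) holding
    per-frame log-likelihoods from independently trained modality models.
    The streams are combined with weights summing to 1, then the best
    class per frame is selected.
    """
    w = audio_weight
    combined = w * audio_loglik + (1.0 - w) * visual_loglik
    return combined.argmax(axis=1)

# Example with random scores standing in for real model outputs:
# 10 frames, 5 candidate classes per modality.
rng = np.random.default_rng(0)
audio = np.log(rng.dirichlet(np.ones(5), size=10))
visual = np.log(rng.dirichlet(np.ones(5), size=10))
print(fuse_stream_scores(audio, visual))
```

In schemes like this, the audio weight is typically lowered as the acoustic SNR drops, shifting reliance toward the visual stream; that reweighting is what makes the combined recognizer robust to acoustic noise.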

Citation (APA)

Nakamura, S. (2001). Fusion of audio-visual information for integrated speech processing. In Lecture Notes in Computer Science (Vol. 2091, pp. 127–143). Springer. https://doi.org/10.1007/3-540-45344-x_20
