This paper describes the integration of audio and visual speech information for robust adaptive speech processing. Since both the audio speech signal and the visual configuration of the face are produced by the same human speech organs, the two types of information are strongly correlated and sometimes complementary. This paper presents two applications that exploit this relationship: bimodal speech recognition that integrates audio-visual information for robustness to acoustic noise, and speaking-face synthesis based on the correlation between audio and visual speech. © Springer-Verlag 2001.
CITATION STYLE
Nakamura, S. (2001). Fusion of audio-visual information for integrated speech processing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2091, pp. 127–143). Springer-Verlag. https://doi.org/10.1007/3-540-45344-x_20