Exploiting speech-gesture correlation in multimodal interaction

Abstract

This paper presents a study on deriving a set of quantitative relationships between speech and co-verbal gestures for improving multimodal input fusion. The initial phase of the study explores the prosodic features of two human communication modalities, speech and gesture, and investigates the nature of their temporal relationships. We have studied a corpus of natural monologues with respect to frequent deictic hand gesture strokes and their concurrent speech prosody. The prosodic features extracted from the speech signal have been co-analyzed with the visual signal to learn how prominent spoken semantic units correlate with the corresponding deictic gesture strokes. The extracted relationships can subsequently be used to disambiguate hand movements, correct speech recognition errors, and improve input fusion in multimodal user interaction with computers. © Springer-Verlag Berlin Heidelberg 2007.
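
To make the kind of speech-gesture co-analysis described above concrete, the sketch below is a minimal illustration, not the paper's method: the F0 peak-picking heuristic, the 10 ms frame rate, the synthetic data, and all variable names are assumptions. It measures the temporal offset between annotated deictic stroke onsets and nearby pitch prominences, the sort of statistic that could later weight hypotheses during multimodal input fusion.

    import numpy as np

    # Hypothetical inputs (not from the paper's corpus):
    #   f0:      pitch contour in Hz, one value per 10 ms frame
    #   strokes: onset times (s) of deictic gesture strokes from video annotation
    frame_rate = 100.0                       # frames per second (10 ms hop)
    f0 = np.random.default_rng(0).uniform(80, 300, size=3000)
    strokes = np.array([2.1, 7.4, 13.0, 19.8, 24.3])

    # 1. Locate prosodically prominent points: local F0 peaks above the
    #    utterance-level mean (a crude stand-in for pitch-accent detection).
    mean_f0 = f0.mean()
    peaks = np.where(
        (f0[1:-1] > f0[:-2]) & (f0[1:-1] > f0[2:]) & (f0[1:-1] > mean_f0)
    )[0] + 1
    peak_times = peaks / frame_rate          # seconds

    # 2. For every gesture stroke, measure the offset to the nearest F0 peak.
    #    Negative offsets mean the stroke precedes the prosodic prominence.
    offsets = np.array([peak_times[np.argmin(np.abs(peak_times - t))] - t
                        for t in strokes])

    # 3. Summarize the temporal relationship (mean lag and its spread).
    print(f"mean stroke-to-peak offset: {offsets.mean():+.3f} s "
          f"(std {offsets.std():.3f} s)")

In the study's setting, the synthetic arrays would be replaced by pitch tracks extracted from the recorded speech and stroke onsets annotated from the visual signal of the monologue corpus.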

Citation (APA)

Chen, F., Choi, E. H. C., & Wang, N. (2007). Exploiting speech-gesture correlation in multimodal interaction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4552 LNCS, pp. 23–30). Springer Verlag. https://doi.org/10.1007/978-3-540-73110-8_3
