Toward natural gesture/speech control of a large display

Abstract

In recent years, advances in computer vision research have made free-hand gestures a viable means of human-computer interaction (HCI). Together with improved speech processing technology, this is an important step toward natural multimodal HCI. However, incorporating non-predefined, continuous gestures into a multimodal framework remains a challenging problem. In this paper, we propose a structured approach for studying patterns of multimodal language in the context of 2D display control. We systematically analyze gestures from observable kinematic primitives to their semantics, as pertinent to a linguistic structure. The proposed semantic classification of co-verbal gestures distinguishes six categories based on their spatio-temporal deixis. We discuss the evolution of a computational framework for gesture and speech integration, which was used to develop an interactive testbed (iMAP). The testbed enabled elicitation of adequate, non-sequential, multimodal patterns in a narrative mode of HCI. The user studies conducted illustrate the significance of accounting for the temporal alignment of gesture and speech parts in semantic mapping. Furthermore, co-occurrence analysis of gesture/speech production suggests a syntactic organization of gestures at the lexical level.
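
As a rough illustration of the kind of gesture/speech co-occurrence analysis the abstract alludes to (not the authors' actual method), the sketch below treats gesture strokes and time-aligned speech tokens as intervals and tallies which gesture categories temporally overlap which spoken words. All interval data, category names, and the overlap tolerance are invented for illustration.

```python
# Hypothetical sketch: tally temporal co-occurrence of gesture strokes and
# spoken words. Data and categories are made up for illustration only.
from collections import defaultdict

# (category, start_s, end_s) for gesture strokes detected in a session
gesture_strokes = [
    ("pointing", 1.2, 1.8),
    ("contour", 3.0, 4.1),
    ("pointing", 6.5, 7.0),
]

# (word, start_s, end_s) for time-aligned speech tokens
speech_tokens = [
    ("here", 1.4, 1.7),
    ("this", 1.7, 1.9),
    ("area", 3.2, 3.6),
    ("move", 6.3, 6.6),
]

def overlaps(a_start, a_end, b_start, b_end, tolerance=0.25):
    """True if two intervals overlap, allowing a small temporal tolerance."""
    return a_start - tolerance <= b_end and b_start - tolerance <= a_end

# Co-occurrence table: gesture category -> word -> count
cooccurrence = defaultdict(lambda: defaultdict(int))
for category, g_start, g_end in gesture_strokes:
    for word, w_start, w_end in speech_tokens:
        if overlaps(g_start, g_end, w_start, w_end):
            cooccurrence[category][word] += 1

for category, words in cooccurrence.items():
    print(category, dict(words))
```

A table of this sort is one simple way to surface how temporal alignment between gesture and speech constrains semantic mapping, as the user studies in the paper discuss.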

Citation (APA)

Kettebekov, S., & Sharma, R. (2001). Toward natural gesture/speech control of a large display. In Lecture Notes in Computer Science (Vol. 2254, pp. 221–234). Springer-Verlag. https://doi.org/10.1007/3-540-45348-2_20
