In this paper, we present a vision-based system that estimates users' pose, as well as the gestures they perform, in real time. The system allows users to interact naturally with an application (e.g., virtual reality or gaming) or a robot. Its main components are a 3D upper-body tracker, which estimates human body pose in real time from a stereo sensor, and a gesture recognizer, which classifies the temporal tracker's output into gesture classes. The main novelty of our system is a bag-of-features representation for temporal sequences. This representation, though simple, proves surprisingly powerful and is able to implicitly learn sequence dynamics. Based on this representation, a multi-class classifier, which treats the bag of features as the feature vector, is applied to estimate the corresponding gesture class. Experiments on an HCI gesture dataset show that our method outperforms state-of-the-art algorithms and generalizes well. Finally, we describe virtual- and real-world applications in which our system was integrated for multimodal interaction. © 2009 Springer-Verlag Berlin Heidelberg.
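The bag-of-features idea from the abstract can be sketched as follows: each frame of a tracked sequence is quantized against a codebook, the codeword counts form a normalized histogram (the "bag"), and a multi-class classifier operates on that histogram. This is a minimal illustration under assumptions not stated in the abstract: the codebook, the frame features, and the nearest-prototype classifier here are hypothetical stand-ins (the paper does not specify which multi-class classifier it uses), and temporal order is deliberately discarded by the representation.

```python
import math
from collections import Counter

def bag_of_features(frames, codebook):
    """Quantize each per-frame feature vector to its nearest codeword
    and return a normalized histogram over codewords (the bag vector).
    Temporal order is dropped; codeword statistics summarize the gesture."""
    def nearest(frame):
        return min(range(len(codebook)),
                   key=lambda k: math.dist(frame, codebook[k]))
    counts = Counter(nearest(f) for f in frames)
    n = len(frames)
    return [counts.get(k, 0) / n for k in range(len(codebook))]

def classify(bag, class_prototypes):
    """Toy multi-class decision: pick the gesture class whose prototype
    bag is nearest in Euclidean distance. Any multi-class classifier
    (e.g., an SVM) could consume the bag vector in the same way."""
    return min(class_prototypes,
               key=lambda c: math.dist(bag, class_prototypes[c]))

# Hypothetical 1-D tracker features and a 2-word codebook for illustration.
codebook = [(0.0,), (1.0,)]
frames = [(0.1,), (0.9,), (1.1,), (0.05,)]
bag = bag_of_features(frames, codebook)          # [0.5, 0.5]
prototypes = {"wave": [0.5, 0.5], "point": [1.0, 0.0]}
label = classify(bag, prototypes)                # "wave"
```

The point of the sketch is that the classifier never sees the sequence itself, only the histogram, which is what makes the representation simple yet able to capture sequence statistics implicitly.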
CITATION STYLE
Demirdjian, D., & Varri, C. (2009). Recognizing gestures for virtual and real world interaction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5815 LNCS, pp. 1–10). https://doi.org/10.1007/978-3-642-04667-4_1