Multi-scale deep learning for gesture detection and localization

138 citations · 185 Mendeley readers

This article is free to access.

Abstract

We present a method for gesture detection and localization based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as motion of the upper body or a hand), and the whole system operates at two temporal scales. Key to our technique is a training strategy which exploits i) careful initialization of individual modalities; and ii) gradual fusion of modalities from strongest to weakest cross-modality structure. We present experiments on the ChaLearn 2014 Looking at People Challenge gesture recognition track, in which we placed first out of 17 teams.
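The two-step training strategy described in the abstract lends itself to a staged loop: pretrain one encoder per modality in isolation, then fuse the modalities one at a time, strongest cross-modality structure first, fine-tuning as each is added. The sketch below illustrates that idea in PyTorch; it is not the authors' implementation, and the modality names, input sizes, fusion order, architecture, and hyperparameters are all placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical modality names and input sizes; the actual system fuses
# pose, depth/intensity video of the hands, and audio at multiple scales.
MODALITIES = {"pose": 36, "depth_hand": 128, "intensity_hand": 128, "audio": 40}
FUSION_ORDER = ["pose", "depth_hand", "intensity_hand", "audio"]  # strongest first
FEAT_DIM, N_CLASSES = 64, 21  # placeholder feature size and class count

def make_encoder(in_dim: int) -> nn.Module:
    """Small per-modality encoder, trained in isolation first (step i)."""
    return nn.Sequential(
        nn.Linear(in_dim, 128), nn.ReLU(),
        nn.Linear(128, FEAT_DIM), nn.ReLU(),
    )

encoders = nn.ModuleDict({m: make_encoder(d) for m, d in MODALITIES.items()})
head = nn.Linear(FEAT_DIM * len(MODALITIES), N_CLASSES)

def fused_logits(batch: dict, active: list) -> torch.Tensor:
    """Concatenate features of the modalities fused so far; the others
    contribute zeros until their turn in the schedule (step ii)."""
    feats = []
    for m in FUSION_ORDER:
        if m in active:
            feats.append(encoders[m](batch[m]))
        else:
            feats.append(torch.zeros(batch[m].shape[0], FEAT_DIM))
    return head(torch.cat(feats, dim=1))

def random_batch(bs: int = 32):
    """Synthetic stand-in for real ChaLearn data, just to keep this runnable."""
    x = {m: torch.randn(bs, d) for m, d in MODALITIES.items()}
    y = torch.randint(0, N_CLASSES, (bs,))
    return x, y

# Gradual fusion schedule: unlock one modality at a time, strongest first,
# and fine-tune the active encoders together with the shared head.
loss_fn = nn.CrossEntropyLoss()
active = []
for m in FUSION_ORDER:
    active.append(m)
    params = [p for name in active for p in encoders[name].parameters()]
    opt = optim.Adam(params + list(head.parameters()), lr=1e-4)
    for _ in range(10):  # a few steps per stage, for illustration
        x, y = random_batch()
        opt.zero_grad()
        loss_fn(fused_logits(x, active), y).backward()
        opt.step()
```

The design choice mirrored here is that weaker modalities only begin receiving gradient once the stronger ones already provide a stable fused representation, so the fusion order matters.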

Cite (APA)

Neverova, N., Wolf, C., Taylor, G. W., & Nebout, F. (2015). Multi-scale deep learning for gesture detection and localization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8925, pp. 474–490). Springer Verlag. https://doi.org/10.1007/978-3-319-16178-5_33
