We present a method for gesture detection and localization based on multi-scale and multi-modal deep learning. Each visual modality captures spatial information at a particular spatial scale (such as the motion of the upper body or a hand), and the whole system operates at two temporal scales. Key to our technique is a training strategy that exploits (i) careful initialization of individual modalities and (ii) gradual fusion of modalities, from strongest to weakest cross-modality structure. We present experiments on the ChaLearn 2014 Looking at People Challenge gesture recognition track, in which we placed first out of 17 teams.
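The gradual-fusion training strategy can be sketched as follows. This is a minimal illustrative outline, not the paper's implementation: the modality names, the numeric "cross-modality structure" scores, and the ordering function are all assumptions introduced here for clarity.

```python
# Hypothetical sketch of a gradual-fusion schedule: each modality is
# first trained in isolation, then modalities are fused one at a time,
# ordered from strongest to weakest cross-modality structure.
# The strength scores below are illustrative, not from the paper.

def fusion_order(modalities, strength):
    """Order modalities by descending cross-modality structure."""
    return sorted(modalities, key=lambda m: strength[m], reverse=True)

def gradual_fusion(modalities, strength):
    """Yield the growing set of jointly trained modalities, one per step."""
    fused = []
    for m in fusion_order(modalities, strength):
        # Step 1 (per modality): assume weights were carefully
        # initialized by training this modality on its own.
        fused.append(m)
        # Step 2: jointly fine-tune all modalities fused so far.
        yield list(fused)

# Toy example with three assumed input channels and assumed scores.
strength = {"depth_hands": 0.9, "skeleton": 0.7, "audio": 0.2}
steps = list(gradual_fusion(["skeleton", "depth_hands", "audio"], strength))
# Each element of `steps` is the modality set trained jointly at that stage.
```

The key design point is that fusion order is data-driven: modalities whose representations correlate most strongly across channels are merged first, so the joint network starts from the easiest fusion problem.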
Neverova, N., Wolf, C., Taylor, G. W., & Nebout, F. (2015). Multi-scale deep learning for gesture detection and localization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8925, pp. 474–490). Springer Verlag. https://doi.org/10.1007/978-3-319-16178-5_33