Abstract
We present an approach to detecting and recognizing gestures in a stream of multi-modal data. Our approach combines a sliding-window gesture detector with features drawn from skeleton data, color imagery, and depth data produced by a first-generation Kinect sensor. The detector consists of a set of one-versus-all boosted classifiers, each tuned to a specific gesture. Features are extracted at multiple temporal scales, and include descriptive statistics of normalized skeleton joint positions, angles, and velocities, as well as image-based hand descriptors. The full set of gesture detectors may be trained in under two hours on a single machine, and is extremely efficient at runtime, operating at 1700fps using only skeletal data, or at 100fps using fused skeleton and image features. Our method achieved a Jaccard Index score of 0.834 on the ChaLearn-2014 Gesture Recognition Test dataset, and was ranked 2nd overall in the competition.
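The two ideas the abstract names, multi-scale temporal features and one-versus-all scoring, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the window lengths, statistic set, and the `detect` threshold are assumptions, and the per-gesture scorers stand in for the boosted classifiers described in the paper.

```python
import numpy as np

def multiscale_features(joints, scales=(8, 16, 32)):
    """Descriptive statistics of joint positions and velocities over
    several temporal window lengths (window sizes are assumptions).
    `joints` has shape (frames, num_joints, 3)."""
    feats = []
    vel = np.diff(joints, axis=0)  # frame-to-frame joint velocities
    for w in scales:
        window = joints[-w:]               # last w frames of the stream
        vwin = vel[-max(w - 1, 1):]        # velocities over the same span
        feats.extend([window.mean(axis=0), window.std(axis=0),
                      vwin.mean(axis=0), vwin.std(axis=0)])
    return np.concatenate([f.ravel() for f in feats])

def detect(feature_vec, scorers, threshold=0.5):
    """One-versus-all detection: score every gesture with its own
    classifier, report the best if it clears a threshold, else None."""
    scores = {g: f(feature_vec) for g, f in scorers.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None
```

In a sliding-window setting, `multiscale_features` would be recomputed as each new frame arrives, and `detect` applied to the resulting vector; the efficiency figures quoted above come from the lightweight, statistics-based nature of these features.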
Citation
Monnier, C., German, S., & Ost, A. (2015). A multi-scale boosted detector for efficient and robust gesture recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8925, pp. 491–502). Springer Verlag. https://doi.org/10.1007/978-3-319-16178-5_34