Adaptive local spatiotemporal features from RGB-D data for one-shot learning gesture recognition

6Citations
Citations of this article
15Readers
Mendeley users who have this article in their library.

Abstract

Noise and constant empirical motion constraints affect the extraction of distinctive spatiotemporal features from one or a few samples per gesture class. To tackle these problems, an adaptive local spatiotemporal feature (ALSTF) using fused RGB-D data is proposed. First, motion regions of interest (MRoIs) are adaptively extracted using grayscale and depth velocity variance information to greatly reduce the impact of noise. Then, corners are used as keypoints if their depth, and velocities of grayscale and of depth meet several adaptive local constraints in each MRoI. With further filtering of noise, an accurate and sufficient number of keypoints is obtained within the desired moving body parts (MBPs). Finally, four kinds of multiple descriptors are calculated and combined in extended gradient and motion spaces to represent the appearance and motion features of gestures. The experimental results on the ChaLearn gesture, CAD-60 and MSRDailyActivity3D datasets demonstrate that the proposed feature achieves higher performance compared with published state-of-the-art approaches under the one-shot learning setting and comparable accuracy under the leave-one-out cross validation.

Cite

CITATION STYLE

APA

Lin, J., Ruan, X., Yu, N., & Yang, Y. H. (2016). Adaptive local spatiotemporal features from RGB-D data for one-shot learning gesture recognition. Sensors (Switzerland), 16(12). https://doi.org/10.3390/s16122171

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free