Human gait recognition in a multicamera environment is a challenging task in biometrics because of large variations in pose and illumination across different views. In this work, to address the problem of view variation, we present a novel stacked autoencoder for learning discriminative view-invariant gait representations. Our autoencoder can efficiently and progressively translate skeleton joint coordinates from any arbitrary view to a common canonical view without requiring prior estimation of the view angle or covariate type and without losing temporal information. We then construct a discriminative gait feature vector by fusing the encoded features with two other spatiotemporal gait features, which is fed into the main recurrent neural network. Experimental evaluations on the challenging CASIA A and CASIA B gait datasets demonstrate that the proposed approach outperforms other state-of-the-art methods in single-view gait recognition. In particular, the proposed method achieves 46.31% and 33.86% average correct class recognition on the ProbeBG and ProbeCL probe sets, respectively, of the CASIA B dataset under view variation; these results are 0.3% and 30.68% higher than those of the previous best-performing methods. Furthermore, in cross-view recognition, our method outperforms other state-of-the-art methods when the view-angle variation is larger than 36°.
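Since the abstract gives no implementation details, the following is a minimal PyTorch-style sketch of the pipeline it describes: a stacked autoencoder that progressively encodes per-frame skeleton joints toward a canonical view, feature-level fusion with two spatiotemporal gait features, and a recurrent classifier. All module names, layer sizes, the joint dimensionality, and the choice of an LSTM as the recurrent network are illustrative assumptions, not the authors' published architecture.

```python
# Minimal sketch of the described pipeline, under assumed dimensions.
# Not the authors' architecture; all names and sizes are hypothetical.
import torch
import torch.nn as nn

class StackedViewAutoencoder(nn.Module):
    """Stacked autoencoder that progressively maps skeleton joint
    coordinates from an arbitrary view toward a canonical view."""
    def __init__(self, joint_dim=75, hidden_dims=(128, 64)):
        # joint_dim=75 assumes 25 joints x 3 coordinates per frame.
        super().__init__()
        dims = (joint_dim, *hidden_dims)
        # One encoder/decoder pair per stack level; each level is assumed
        # to move the pose one step closer to the canonical view.
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(dims[i], dims[i + 1]), nn.Tanh())
            for i in range(len(dims) - 1)
        )
        self.decoders = nn.ModuleList(
            nn.Sequential(nn.Linear(dims[i + 1], dims[i]), nn.Tanh())
            for i in range(len(dims) - 1)
        )

    def encode(self, x):
        # x: (batch, frames, joint_dim); encoding is applied per frame,
        # so the temporal ordering of the sequence is preserved.
        for enc in self.encoders:
            x = enc(x)
        return x

class GaitRecognizer(nn.Module):
    """Fuses the view-invariant encoding with two other spatiotemporal
    gait features and classifies the sequence with a recurrent network."""
    def __init__(self, n_subjects=124, enc_dim=64, feat_dim=32):
        # n_subjects=124 matches CASIA B's subject count; feat_dim is assumed.
        super().__init__()
        self.autoencoder = StackedViewAutoencoder(hidden_dims=(128, enc_dim))
        self.rnn = nn.LSTM(enc_dim + 2 * feat_dim, 128, batch_first=True)
        self.classifier = nn.Linear(128, n_subjects)

    def forward(self, joints, feat_a, feat_b):
        # joints: (batch, frames, 75); feat_a / feat_b: (batch, frames, 32)
        # stand in for the two spatiotemporal gait features.
        z = self.autoencoder.encode(joints)
        fused = torch.cat([z, feat_a, feat_b], dim=-1)  # feature-level fusion
        out, _ = self.rnn(fused)
        return self.classifier(out[:, -1])  # logits from the final time step
```

Encoding each frame independently and only then running the recurrent network is one plausible reading of "without losing temporal information": the view normalization never mixes frames, leaving sequence dynamics for the RNN to model.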
CITATION
Hasan, M. M., & Mustafa, H. A. (2021). Learning view-invariant features using stacked autoencoder for skeleton-based gait recognition. IET Computer Vision, 15(7), 527–545. https://doi.org/10.1049/cvi2.12050