STEP: Spatial temporal graph convolutional networks for emotion perception from gaits

112 Citations (citations of this article)
103 Readers (Mendeley users who have this article in their library)

Abstract

We present a novel classifier network called STEP to classify perceived human emotion from gaits, based on a Spatial Temporal Graph Convolutional Network (ST-GCN) architecture. Given an RGB video of an individual walking, our formulation implicitly exploits the gait features to classify the perceived emotion of the human into one of four emotions: happy, sad, angry, or neutral. We train STEP on annotated real-world gait videos, augmented with annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN-based Conditional Variational Autoencoder (CVAE). We incorporate a novel push-pull regularization loss in the CVAE formulation of STEP-Gen to generate realistic gaits and improve the classification accuracy of STEP. We also release a novel dataset (E-Gait), which consists of 4,227 human gaits annotated with perceived emotions, along with thousands of synthetic gaits. In practice, STEP can learn the affective features and achieves a classification accuracy of 88% on E-Gait, which is 14-30% more accurate than prior methods.
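The abstract does not include implementation details, but the classifier it describes follows the general ST-GCN pattern: graph convolutions over skeleton joints interleaved with temporal convolutions over frames, followed by a small classification head over the four emotion classes. The PyTorch sketch below is only an illustration of that pattern under assumed settings; the class names, layer widths, joint count, and identity adjacency are placeholders and do not come from the paper or its released code.

```python
import torch
import torch.nn as nn

class STGCNBlock(nn.Module):
    """One spatial-temporal block: mix channels, propagate over the skeleton graph,
    then convolve over time. All sizes here are illustrative."""
    def __init__(self, in_channels, out_channels, kernel_t=9):
        super().__init__()
        self.spatial = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.temporal = nn.Conv2d(out_channels, out_channels,
                                  kernel_size=(kernel_t, 1),
                                  padding=(kernel_t // 2, 0))
        self.relu = nn.ReLU()

    def forward(self, x, adj):
        # x: (batch, channels, frames, joints); adj: (joints, joints) normalized adjacency
        x = self.spatial(x)
        x = torch.einsum("nctv,vw->nctw", x, adj)   # propagate features along skeleton edges
        return self.relu(self.temporal(x))

class GaitEmotionClassifier(nn.Module):
    """Stacked ST-GCN-style blocks, global pooling, and a 4-way emotion head
    (happy, sad, angry, neutral)."""
    def __init__(self, in_channels=3, num_classes=4):
        super().__init__()
        self.block1 = STGCNBlock(in_channels, 32)
        self.block2 = STGCNBlock(32, 64)
        self.head = nn.Linear(64, num_classes)

    def forward(self, x, adj):
        x = self.block1(x, adj)
        x = self.block2(x, adj)
        x = x.mean(dim=[2, 3])          # average over frames and joints
        return self.head(x)

# Example: a batch of 8 gaits, 3D joint coordinates, 75 frames, 16 joints (assumed shapes).
model = GaitEmotionClassifier()
poses = torch.randn(8, 3, 75, 16)
adj = torch.eye(16)                      # placeholder; a real normalized skeleton adjacency goes here
logits = model(poses, adj)               # (8, 4) emotion scores
```

The actual STEP network, its STEP-Gen CVAE, and the push-pull regularization loss differ in detail; see the paper cited below for the full formulation.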

Cite (APA)

Bhattacharya, U., Mittal, T., Chandra, R., Randhavane, T., Bera, A., & Manocha, D. (2020). STEP: Spatial temporal graph convolutional networks for emotion perception from gaits. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 1342–1350). AAAI Press. https://doi.org/10.1609/aaai.v34i02.5490
