Directional Temporal Modeling for Action Recognition

31 citations · 96 Mendeley readers

Abstract

Many current activity recognition models use 3D convolutional neural networks (e.g., I3D, I3D-NL) to generate local spatio-temporal features. However, such features do not encode clip-level, ordered temporal information. In this paper, we introduce a channel-independent directional convolution (CIDC) operation that learns to model the temporal evolution among local features. By applying multiple CIDC units, we construct a lightweight network that models clip-level temporal evolution across multiple spatial scales. Our CIDC network can be attached to any activity recognition backbone. We evaluate our method on four popular activity recognition datasets and consistently improve upon state-of-the-art techniques. We further visualize the activation maps of our CIDC network and show that it focuses on the more meaningful, action-related parts of the frame.
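
The abstract does not spell out the operation, so the sketch below is only a rough, assumption-based illustration of the general idea in PyTorch: "channel independent" is realized as a depthwise (groups = channels) Conv1d over the temporal axis, and "directional" as causal left-padding so each output step aggregates only current and past frames. The class name, shapes, and kernel size are hypothetical; the authors' actual CIDC unit differs in detail.

```python
import torch
import torch.nn as nn

class DirectionalTemporalConv(nn.Module):
    """Hypothetical sketch: depthwise causal temporal convolution."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.pad = kernel_size - 1  # left-only padding makes it causal
        self.conv = nn.Conv1d(
            channels, channels, kernel_size,
            groups=channels,  # one filter per channel: channel independent
            bias=False,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) clip features from a backbone,
        # e.g. spatially pooled I3D activations
        x = nn.functional.pad(x, (self.pad, 0))  # no future-frame leakage
        return self.conv(x)

# Usage with made-up shapes: 2 clips, 1024 channels, 16 frames.
feats = torch.randn(2, 1024, 16)
out = DirectionalTemporalConv(1024)(feats)
print(out.shape)  # torch.Size([2, 1024, 16])
```

Stacking several such units at different spatial scales, as the abstract describes, would yield a lightweight clip-level temporal model on top of any backbone.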

Citation (APA)

Li, X., Shuai, B., & Tighe, J. (2020). Directional Temporal Modeling for Action Recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12351 LNCS, pp. 275–291). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58539-6_17
