Learning Attention-Enhanced Spatiotemporal Representation for Action Recognition

Abstract

Learning spatiotemporal features via 3D-CNN (3D Convolutional Neural Network) models has been regarded as an effective approach for action recognition. In this paper, we explore the visual attention mechanism for video analysis and propose a novel 3D-CNN model, dubbed AE-I3D (Attention-Enhanced Inflated-3D Network), for learning attention-enhanced spatiotemporal representation. The contribution of our AE-I3D is threefold. First, we inflate soft attention to the spatiotemporal scope of 3D videos and adopt softmax to generate a probability distribution over attentional features in a feedforward 3D-CNN architecture. Second, we devise an AE-Res (Attention-Enhanced Residual learning) module, which learns attention-enhanced features in a two-branch residual learning manner; the AE-Res module is lightweight and flexible, so it can be easily embedded into many 3D-CNN architectures. Finally, we embed multiple AE-Res modules into an I3D (Inflated-3D) network, yielding our AE-I3D model, which can be trained in an end-to-end, video-level manner. Unlike previous attention networks, our method inflates residual attention from 2D images to 3D videos, performing 3D attention residual learning to enhance the spatiotemporal representation. We use RGB-only video data for evaluation on three benchmarks: UCF101, HMDB51, and Kinetics. The experimental results demonstrate that our AE-I3D is effective and achieves competitive performance.
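The core idea described above (a two-branch residual module in which a softmax-normalized attention distribution reweights trunk features before a residual add) can be illustrated with a minimal, framework-free sketch. This is a hypothetical illustration of the general attention-residual pattern, not the authors' AE-Res implementation, and it operates on a flat feature list rather than a real 3D (time × height × width) tensor:

```python
import math

def softmax(xs):
    # Softmax turns raw attentional responses into a probability
    # distribution (non-negative weights summing to 1).
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_residual(features):
    # Two-branch residual learning (illustrative only):
    # - trunk branch: passes the features through unchanged
    # - mask branch: produces softmax attention weights
    # The attended features are added back to the trunk, so the
    # module can only enhance, never erase, the original signal.
    weights = softmax(features)
    return [f + f * w for f, w in zip(features, weights)]
```

In a real 3D-CNN, the mask branch would be a small stack of spatiotemporal convolutions and the softmax would be taken over the flattened time-height-width positions of a feature map; the residual formulation shown here is what makes such a module drop-in embeddable, since with near-uniform attention it approximately reduces to an identity-plus-scaling mapping.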

Citation (APA)

Shi, Z., Cao, L., Guan, C., Zheng, H., Gu, Z., Yu, Z., & Zheng, B. (2020). Learning Attention-Enhanced Spatiotemporal Representation for Action Recognition. IEEE Access, 8, 16785–16794. https://doi.org/10.1109/ACCESS.2020.2968024
