Action recognition using deep 3D CNNs with sequential feature aggregation and attention

Fazliddin Anvarov; Dae Ha Kim; Byung Cheol Song

Journal Article, Open Access

Electronics (Switzerland) (2020) 9(1)

DOI: 10.3390/electronics9010147

18 Citations

20 Readers

Abstract

Action recognition is an active research field that aims to recognize human actions and intentions from a series of observations of human behavior and the environment. Image-based action recognition mainly relies on two-dimensional (2D) convolutional neural networks (CNNs). Video-based action recognition is harder because the model must characterize both short-term small movements and long-term temporal appearance information. Previous methods analyze video action behavior using only a basic 3D CNN framework. These approaches are limited in analyzing fast movements or abruptly appearing objects because of the limited coverage of the convolutional filter. In this paper, we propose aggregating squeeze-and-excitation (SE) and self-attention (SA) modules with a 3D CNN to analyze both short-term and long-term temporal action behavior efficiently. The resulting approach builds upon current state-of-the-art methods and demonstrates better performance on the UCF-101 and HMDB51 datasets. For example, with the ResNeXt-101 architecture in a 3D CNN, we obtain accuracies of 92.5% (16-frame clips) and 95.6% (64-frame clips) on UCF-101, and 68.1% (16-frame clips) and 74.1% (64-frame clips) on HMDB51.
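The abstract does not say where the SE and SA modules sit in the network or how they are parameterized. As a rough illustration of the two generic operations applied to a 3D CNN feature map, here is a minimal, dependency-free Python sketch. The function names, the toy weights, the identity query/key/value projections, and the `gamma` residual scale are all assumptions made for illustration, not the authors' implementation.

```python
import math


def se_block(x, w1, w2):
    """Squeeze-and-excitation over channels (illustrative sketch).

    x  : list of C channels, each a flat list of T*H*W activations
    w1 : (C//r) x C reduction weights; w2 : C x (C//r) expansion weights
    """
    # Squeeze: global average pool each channel over time and space.
    z = [sum(ch) / len(ch) for ch in x]
    # Excitation: FC -> ReLU -> FC -> sigmoid yields one gate per channel.
    h = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    s = [1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(row, h))))
         for row in w2]
    # Scale: reweight every activation of a channel by that channel's gate.
    return [[s_c * v for v in ch] for s_c, ch in zip(s, x)]


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    e = [math.exp(v - m) for v in row]
    total = sum(e)
    return [v / total for v in e]


def self_attention(x, gamma=1.0):
    """Dot-product self-attention over all positions, plus a residual.

    x : list of C channels, each a flat list of N = T*H*W activations.
    Assumption: query, key and value projections are the identity.
    """
    n = len(x[0])
    # One C-dimensional feature vector per spatio-temporal position.
    pos = [[ch[i] for ch in x] for i in range(n)]
    out = [list(ch) for ch in x]
    for i in range(n):
        # Similarity of position i to every position, near or far in time.
        scores = [sum(a * b for a, b in zip(pos[i], pos[j])) for j in range(n)]
        attn = softmax(scores)
        for c in range(len(x)):
            out[c][i] += gamma * sum(a * pos[j][c] for j, a in enumerate(attn))
    return out


# Toy feature map: C = 2 channels, N = 4 flattened positions (hypothetical).
features = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0]]
w1 = [[0.5, 0.5]]        # reduce 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]     # expand 1 hidden unit -> 2 channel gates
gated = se_block(features, w1, w2)
attended = self_attention(gated)
```

In this sketch the SE step reweights whole channels from a globally pooled summary, while the attention step lets each position draw on every other position, which is one way to reach beyond the local coverage of a convolutional filter.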

Cite

Anvarov, F., Kim, D. H., & Song, B. C. (2020). Action recognition using deep 3D CNNs with sequential feature aggregation and attention. Electronics (Switzerland), 9(1). https://doi.org/10.3390/electronics9010147
