Abstract
Human action recognition is one of the most challenging tasks in artificial intelligence. In this paper, we propose a novel long-term temporal feature learning architecture for recognizing human actions in video, named Pseudo Recurrent Residual Neural Networks (P-RRNNs), which exploits a recurrent architecture and composes each network with different connections among units. A two-stream CNN model (GoogLeNet) is employed to extract local spatial and temporal features, respectively. The local spatial and temporal features are then integrated into global long-term temporal features by our proposed two-stream P-RRNNs. Finally, a softmax layer fuses the outputs of the two-stream P-RRNNs for action recognition. Experimental results on two standard databases, UCF101 and HMDB51, demonstrate the outstanding performance of the proposed architecture for human action recognition.
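The abstract describes a three-stage pipeline: per-frame feature extraction by a two-stream GoogLeNet, long-term temporal aggregation by a recurrent module per stream, and softmax-level fusion. The sketch below illustrates only that overall wiring; since the exact P-RRNN unit is not specified here, a standard GRU stands in for it (an assumption, not the authors' unit), and the feature dimension (1024, GoogLeNet's pooled output), hidden size, and class count are illustrative.

    # Minimal sketch of the two-stream pipeline, assuming PyTorch.
    # NOTE: nn.GRU is a stand-in for the paper's P-RRNN, whose internal
    # connections are not given in the abstract; dimensions are illustrative.
    import torch
    import torch.nn as nn

    class StreamAggregator(nn.Module):
        """One stream: per-frame CNN features -> long-term temporal feature."""
        def __init__(self, feat_dim=1024, hidden_dim=512, num_classes=101):
            super().__init__()
            # Frame features are assumed precomputed by a GoogLeNet backbone
            # truncated before its classifier (1024-d after global pooling).
            self.rnn = nn.GRU(feat_dim, hidden_dim, batch_first=True)
            self.fc = nn.Linear(hidden_dim, num_classes)

        def forward(self, feats):           # feats: (B, T, feat_dim)
            _, h = self.rnn(feats)          # h: (1, B, hidden_dim)
            return self.fc(h.squeeze(0))    # class scores: (B, num_classes)

    class TwoStreamModel(nn.Module):
        """Spatial (RGB) and temporal (optical-flow) streams, fused at softmax."""
        def __init__(self, num_classes=101):
            super().__init__()
            self.spatial = StreamAggregator(num_classes=num_classes)
            self.temporal = StreamAggregator(num_classes=num_classes)

        def forward(self, rgb_feats, flow_feats):
            # Late fusion: average the two streams' softmax distributions.
            p_s = torch.softmax(self.spatial(rgb_feats), dim=1)
            p_t = torch.softmax(self.temporal(flow_feats), dim=1)
            return (p_s + p_t) / 2

    # Toy usage: batch of 2 clips, 16 frames each, 1024-d frame features.
    model = TwoStreamModel(num_classes=101)   # 101 classes, as in UCF101
    rgb = torch.randn(2, 16, 1024)
    flow = torch.randn(2, 16, 1024)
    probs = model(rgb, flow)                  # (2, 101) class probabilities

Averaging the two softmax outputs is one common late-fusion choice for two-stream models; the paper may weight or combine the streams differently.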
Citation
Yu, S., Xie, L., Liu, L., & Xia, D. (2020). Learning Long-Term Temporal Features with Deep Neural Networks for Human Action Recognition. IEEE Access, 8, 1840–1850. https://doi.org/10.1109/ACCESS.2019.2962284