Human Action Recognition for Dynamic Scenes of Emergency Rescue Based on Spatial-Temporal Fusion Network

Abstract

To address the insufficient use of temporal and spatial information in videos and low recognition accuracy, this paper proposes a human action recognition method for dynamic emergency rescue videos based on a spatial-temporal fusion network. A time-domain segmentation strategy based on random sampling preserves the overall temporal structure of the video. To account for the asynchronous relationship between spatial and temporal information, multiple asynchronous motion sequences are added as inputs to the temporal convolutional network. Spatial-temporal features are fused in the convolutional layers to reduce feature loss. Because time-series information is crucial for human action recognition, the mid-layer spatial-temporal fusion features are fed into a Bidirectional Long Short-Term Memory (Bi-LSTM) network to obtain human movement features across the whole temporal dimension of the video. Experimental results show that the proposed method fully fuses spatial and temporal information and improves the accuracy of human action recognition in dynamic scenes. It is also faster than traditional methods.
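The segmentation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the code below assumes one common form of segment-based random sampling: the video is split into roughly equal consecutive segments and one frame index is drawn at random from each. The function name `sample_segment_indices` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Draw one random frame index from each of num_segments consecutive spans.

    Assumption: segments are equal-length (up to rounding) and cover the
    whole clip, so the sampled frames keep the clip's overall time order.
    """
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        start = k * num_frames // num_segments        # inclusive
        end = (k + 1) * num_frames // num_segments    # exclusive
        indices.append(rng.randrange(start, end))
    return indices


if __name__ == "__main__":
    # A 90-frame clip split into 3 segments: one index from each third.
    print(sample_segment_indices(90, 3, random.Random(0)))
```

For a 90-frame clip and 3 segments, the call returns one index from each of the ranges [0, 30), [30, 60), and [60, 90). The indices are always in increasing order, which is one way such a sampling scheme can cover the whole clip while processing only a few frames.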

Cite

Zhang, Y., Guo, Q., Du, Z., & Wu, A. (2023). Human Action Recognition for Dynamic Scenes of Emergency Rescue Based on Spatial-Temporal Fusion Network. Electronics (Switzerland), 12(3). https://doi.org/10.3390/electronics12030538
