Real-Time Action Recognition Using Multi-level Action Descriptor and DNN

  • Jin C
  • Dung Do T
  • Liu M
  • et al.

Abstract

This work presents a novel approach to real-time human action recognition in intelligent video surveillance. To label actions more efficiently and precisely, it proposes a multilevel action descriptor that captures complete information about a human action. The descriptor consists of three levels: posture, locomotion, and gesture. Each level corresponds to a different group of subactions that together describe a single human action, for example, smoking while walking. The proposed method localizes and recognizes the actions of multiple individuals simultaneously, using appearance-based temporal features with multiple convolutional neural networks (CNNs). Although appearance cues have been successfully exploited for visual recognition problems, appearance cues, motion history cues, and their combination with multiple CNNs have not yet been explored. Additionally, this work presents the first systematic estimation of several hyperparameters for shape and motion history cues. The proposed approach achieves a mean average precision (mAP) of 73.2% in frame-based evaluation on the newly collected large-scale ICVL video dataset. The action recognition model runs at around 25 frames per second, which is suitable for real-time surveillance applications.
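The three-level descriptor in the abstract can be pictured as a small data structure. The following is only a minimal sketch, not the authors' implementation: the class name `ActionDescriptor`, the helper `describe`, the per-level score dictionaries, and every label string other than "smoking" and "walking" are hypothetical. It also assumes, purely for illustration, that each level receives its own set of classifier scores.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ActionDescriptor:
    """One person's action, split into the three levels named in the abstract."""
    posture: str                    # hypothetical label, e.g. "standing"
    locomotion: str                 # e.g. "walking"
    gesture: Optional[str] = None   # e.g. "smoking"; None when no gesture applies

    def label(self) -> str:
        """Compose a readable label such as 'smoking while walking'."""
        if self.gesture is None:
            return self.locomotion
        return f"{self.gesture} while {self.locomotion}"


def describe(posture_scores: Dict[str, float],
             locomotion_scores: Dict[str, float],
             gesture_scores: Dict[str, float]) -> ActionDescriptor:
    """Pick the highest-scoring label at each level (assumed scoring scheme)."""
    def pick(scores: Dict[str, float]) -> str:
        return max(scores, key=scores.get)

    return ActionDescriptor(pick(posture_scores),
                            pick(locomotion_scores),
                            pick(gesture_scores))


example = describe(
    {"standing": 0.9, "sitting": 0.1},
    {"walking": 0.7, "stationary": 0.3},
    {"smoking": 0.6, "nothing": 0.4},
)
print(example.label())
```

Keeping the levels as separate fields means one person can carry a posture, a locomotion, and a gesture label at once, which is what lets a composite action be expressed without enumerating every combination as its own class.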

Cite (APA)

Jin, C.-B., Dung Do, T., Liu, M., & Kim, H. (2019). Real-Time Action Recognition Using Multi-level Action Descriptor and DNN. In Intelligent Video Surveillance. IntechOpen. https://doi.org/10.5772/intechopen.76086
