ViCo-MoCo-DL: Video Coding and Motion Compensation Solutions for Human Activity Recognition Using Deep Learning


Abstract

This paper proposes three novel feature extraction approaches for human activity recognition in videos. The proposed solutions are based on video coding concepts, including motion compensation and coding-based feature variables. We use these features with deep learning for model generation and classification, hence the abbreviation ViCo-MoCo-DL, which stands for Video Coding and Motion Compensation with Deep Learning. The solutions are fused by averaging their classification scores to predict the human activity in a video. In all proposed solutions, an input video is temporally segmented into 12 non-overlapping segments of equal size. In the first and second solutions, each segment is converted into one component of an RGB image, so the 12 segments yield 4 RGB images. The conversion captures motion by means of motion estimation, motion compensation, and accumulation of image prediction errors. In the first solution, the 4 generated RGB images are tiled into one large image, which is used to train a Convolutional Neural Network (CNN). In the second solution, each generated RGB image is fed into a pre-trained CNN for feature extraction. The resulting Feature Vectors (FVs) are arranged into a matrix and used to train a Long Short-Term Memory (LSTM) network. In the third solution, a customized High Efficiency Video Coding (HEVC) encoder is used to generate feature variables per frame. The resulting FVs of every 3 video segments are arranged into a matrix and numerically summarized into one FV; each input video is thus represented by 4 FVs, which are used to train another LSTM network. Experimental results on three well-known datasets show that the proposed fused solution achieves better classification results than existing work.
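
The segment-to-image conversion, tiling, and score fusion described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation. It assumes grayscale input frames, uses plain frame differencing (zero-motion prediction) as a stand-in for real motion estimation and compensation, drops any trailing frames that do not fill a segment, and tiles the 4 images in a 2x2 grid. All function names are hypothetical; only the counts (12 segments, 3 components per image, 4 images) come from the abstract.

```python
import numpy as np

NUM_SEGMENTS = 12        # stated in the abstract
SEGMENTS_PER_IMAGE = 3   # one segment per R, G, B component


def split_into_segments(frames, num_segments=NUM_SEGMENTS):
    """Split a (T, H, W) clip into equal-size, non-overlapping segments."""
    seg_len = frames.shape[0] // num_segments
    if seg_len < 2:
        raise ValueError("need at least 2 frames per segment")
    # Assumption: trailing frames that do not fill a segment are dropped.
    usable = frames[: seg_len * num_segments]
    return usable.reshape(num_segments, seg_len, *frames.shape[1:])


def accumulate_prediction_error(segment):
    """Accumulate absolute errors of predicting each frame from the previous one.

    Assumption: zero-motion prediction (frame differencing) replaces
    block-based motion estimation and compensation.
    """
    diffs = np.abs(np.diff(segment.astype(np.float32), axis=0))
    acc = diffs.sum(axis=0)
    peak = acc.max()
    return acc / peak if peak > 0 else acc  # scale to [0, 1]


def segments_to_rgb_images(frames):
    """Turn 12 segments into 4 images of shape (H, W, 3)."""
    segments = split_into_segments(frames)
    planes = np.stack([accumulate_prediction_error(s) for s in segments])
    h, w = planes.shape[1:]
    # Group consecutive triples: (12, H, W) -> (4, 3, H, W) -> (4, H, W, 3).
    return planes.reshape(-1, SEGMENTS_PER_IMAGE, h, w).transpose(0, 2, 3, 1)


def tile_2x2(images):
    """Tile 4 images into one large image (2x2 layout is an assumption)."""
    top = np.concatenate([images[0], images[1]], axis=1)
    bottom = np.concatenate([images[2], images[3]], axis=1)
    return np.concatenate([top, bottom], axis=0)


def fuse_scores(*score_vectors):
    """Average per-class scores from several models and pick the best class."""
    mean_scores = np.mean(np.stack(score_vectors), axis=0)
    return int(np.argmax(mean_scores)), mean_scores
```

For a clip of 48 frames at 16x20 pixels, `segments_to_rgb_images` would return an array of shape `(4, 16, 20, 3)` and `tile_2x2` an image of shape `(32, 40, 3)`. In the first solution such a tiled image would be the CNN input; in the second, each of the 4 images would instead go through a pre-trained CNN before the LSTM.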

Citation (APA)

Shanableh, T. (2023). ViCo-MoCo-DL: Video Coding and Motion Compensation Solutions for Human Activity Recognition Using Deep Learning. IEEE Access, 11, 73971–73981. https://doi.org/10.1109/ACCESS.2023.3296252
