© 2015, Springer International Publishing Switzerland. With advances in technology and the growing availability of multimedia content, human action recognition has become a major area of research in computer vision that contributes to the semantic analysis of videos. The representation and matching of spatio-temporal information in videos is a major factor affecting the design and performance of existing convolutional neural network approaches to human action recognition. In this paper, in contrast to the traditional approach of using raw video as input, we derive attributes from action bank features to represent and match spatio-temporal information effectively. The derived features are arranged in a square matrix and used as input to a convolutional neural network for action recognition. The effectiveness of the proposed approach is demonstrated on the KTH and UCF Sports datasets.
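The abstract does not say how the derived features are laid out in the square matrix. As a minimal sketch only, the arrangement step might look like the following, assuming a 1-D feature vector, row-major filling, and zero padding up to the next perfect square. The function name `to_square_matrix` and all three of those choices are illustrative assumptions, not the authors' method.

```python
import math

import numpy as np


def to_square_matrix(features):
    """Arrange a 1-D feature vector into the smallest square matrix that holds it.

    Assumptions (not taken from the paper): row-major order, zero padding.
    """
    vec = np.asarray(features, dtype=np.float32).ravel()
    n = vec.size
    if n == 0:
        raise ValueError("feature vector must not be empty")

    # Smallest side length whose square is at least n.
    side = math.isqrt(n)
    if side * side < n:
        side += 1

    # Zero-pad to side * side entries, then reshape row by row.
    padded = np.zeros(side * side, dtype=np.float32)
    padded[:n] = vec
    return padded.reshape(side, side)
```

Under these assumptions, a 10-element vector yields a 4×4 matrix whose last six entries are zero. The result could then be passed to a CNN as a single-channel image.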