Feature aggregation tree: Capture temporal motion information for action recognition in videos

Abstract

We propose a model named Feature Aggregation Tree to capture the temporal motion information in videos for action recognition. Feature Aggregation Tree constructs a logical motion sequence by considering the concrete semantics of features and mining feature combinations in a video. It stores the different feature combinations and then uses a Bayesian model to compute the conditional probability of each frame-level feature given the previous features, aggregating the features accordingly; the model is independent of video length. Compared with existing feature aggregation methods that aim to enhance the descriptive capacity of features, our model has the following advantages: (i) it captures the temporal motion information in a video and predicts conditional probabilities with a Bayesian model; (ii) it handles videos of arbitrary length, without resorting to uniform sampling or fixed-size feature encoding; (iii) it is compact and efficient compared to other encoding methods, yielding significant improvements over baseline methods. Experiments on the UCF101 and HMDB51 datasets demonstrate the effectiveness of our method.
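To make the idea concrete, here is a minimal sketch of how such an aggregation could work. It assumes frame-level features have been quantized into discrete codewords; the class name, the prefix-tree representation, and the Laplace-smoothed conditional probabilities are all illustrative stand-ins, not the paper's exact formulation.

```python
from collections import defaultdict

class FeatureAggregationTree:
    """Hypothetical sketch: a prefix tree over quantized frame-level
    feature codewords. Each node counts how often a codeword follows
    a given prefix, so conditional probabilities of a feature given
    the previous features can be estimated (here with simple Laplace
    smoothing as a stand-in for the paper's Bayesian model)."""

    def __init__(self, vocab_size, depth=3):
        self.vocab_size = vocab_size  # number of discrete feature codewords
        self.depth = depth            # how many previous features to condition on
        # prefix tuple -> {next codeword -> count}
        self.counts = defaultdict(lambda: defaultdict(int))

    def add_video(self, codewords):
        """Record all (prefix -> next feature) combinations observed
        in one video, regardless of the video's length."""
        for t, f in enumerate(codewords):
            prefix = tuple(codewords[max(0, t - self.depth):t])
            self.counts[prefix][f] += 1

    def cond_prob(self, prefix, f):
        """P(f | prefix) with Laplace smoothing."""
        node = self.counts[tuple(prefix)]
        total = sum(node.values())
        return (node[f] + 1) / (total + self.vocab_size)

    def aggregate(self, codewords):
        """Aggregate a video of arbitrary length into a fixed-size
        descriptor: per-codeword sums of conditional probabilities
        along the sequence, L1-normalized."""
        desc = [0.0] * self.vocab_size
        for t, f in enumerate(codewords):
            prefix = codewords[max(0, t - self.depth):t]
            desc[f] += self.cond_prob(prefix, f)
        total = sum(desc) or 1.0
        return [d / total for d in desc]

# Usage: two toy "videos" of different lengths both map to
# descriptors of the same fixed size (vocab_size).
tree = FeatureAggregationTree(vocab_size=8)
tree.add_video([0, 1, 2, 1, 3])
tree.add_video([0, 1, 3])
print(tree.aggregate([0, 1, 2, 1, 3, 3, 2]))
```

The key property this illustrates is the one claimed in the abstract: because aggregation is a sum of per-step conditional probabilities, the descriptor size is fixed by the vocabulary, not by the number of frames.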

Cite (APA)

Zhu, B. (2018). Feature aggregation tree: Capture temporal motion information for action recognition in videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11258 LNCS, pp. 316–327). Springer-Verlag. https://doi.org/10.1007/978-3-030-03338-5_27
