Temporal Attention Neural Network for Video Understanding

Abstract

Deep learning based vision understanding algorithms have recently approached human-level performance in object recognition and image captioning. These evaluations, however, are limited to static data, and the algorithms themselves have notable limitations: they cannot selectively encode human behavior, the movement of multiple objects, or time-varying changes in the background. To address these limitations and extend such algorithms to dynamic videos, we propose a temporal attention CNN-RNN network with a motion saliency map. Our proposed model overcomes the scarcity of usable information in the encoded data and efficiently integrates motion features by exploiting the dynamic nature of the information present in successive frames. We evaluate the proposed model on the public UCF101 dataset, and our experiments demonstrate that it successfully extracts motion information for video understanding without any computationally intensive preprocessing.
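To make the architecture described above more concrete, the following is a minimal sketch, in PyTorch, of the general idea: per-frame CNN features, a motion saliency signal derived from differences between successive frames, and a temporal attention step that weights frames before an RNN aggregates them for classification. All layer sizes, the simple frame-difference saliency, and the class name are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TemporalAttentionVideoNet(nn.Module):
    """Illustrative temporal-attention CNN-RNN sketch (hypothetical sizes)."""

    def __init__(self, num_classes=101, feat_dim=128, hidden_dim=256):
        super().__init__()
        # Small per-frame CNN encoder (stand-in for any image backbone).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Scores each frame's appearance feature for temporal attention.
        self.attn = nn.Linear(feat_dim, 1)
        # RNN aggregates the attention-weighted frame features.
        self.rnn = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, frames):
        # frames: (batch, time, 3, H, W)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)

        # Motion saliency: mean absolute difference between successive frames,
        # used here as a simple per-frame motion score (an assumption, not the
        # paper's exact saliency map).
        diffs = (frames[:, 1:] - frames[:, :-1]).abs().mean(dim=(2, 3, 4))
        motion = F.pad(diffs, (1, 0))                       # (b, t); first frame gets 0

        # Temporal attention combines appearance scores with motion saliency.
        scores = self.attn(feats).squeeze(-1) + motion      # (b, t)
        alpha = torch.softmax(scores, dim=1).unsqueeze(-1)  # (b, t, 1)

        out, _ = self.rnn(feats * alpha)                    # weight frames, then aggregate
        return self.classifier(out[:, -1])                  # predict from last hidden state


if __name__ == "__main__":
    model = TemporalAttentionVideoNet()
    clip = torch.randn(2, 16, 3, 64, 64)  # 2 clips of 16 RGB frames
    print(model(clip).shape)              # torch.Size([2, 101])
```

In this sketch the motion saliency needs only frame subtraction, consistent with the abstract's claim of avoiding computationally intensive preprocessing such as optical flow.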

Citation (APA)

Son, J., Jang, G. J., & Lee, M. (2017). Temporal Attention Neural Network for Video Understanding. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10635 LNCS, pp. 422–430). Springer Verlag. https://doi.org/10.1007/978-3-319-70096-0_44
