Deep-learning-based vision understanding algorithms have recently approached human-level performance in object recognition and image captioning. These evaluations, however, are limited to static data, and the algorithms themselves have notable limitations: they cannot selectively encode human behavior, the movement of multiple objects, or time-varying changes in the background. To address these limitations and extend such algorithms to dynamic videos, we propose a temporal attention CNN-RNN network with a motion saliency map. The proposed model overcomes the scarcity of usable information in encoded data and efficiently integrates motion features by exploiting the dynamic nature of the information in successive frames. We evaluate the model on the public UCF101 dataset, and our experiments demonstrate that it successfully extracts motion information for video understanding without any computationally intensive preprocessing.
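To make the architecture concrete, below is a minimal PyTorch sketch of one way a temporal-attention CNN-RNN with a frame-difference motion saliency map could be wired together. The layer sizes, the saliency computation, the attention form, and all names are illustrative assumptions, not the authors' implementation; it only mirrors the components the abstract names (per-frame CNN encoding, an RNN over time, temporal attention, and a cheap motion saliency signal).

```python
# A minimal sketch, assuming: a small CNN backbone, a GRU over frame
# features, additive temporal attention, and absolute frame differencing
# as the motion saliency map. None of these details come from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalAttentionNet(nn.Module):
    def __init__(self, feat_dim=128, hidden_dim=256, num_classes=101):
        super().__init__()
        # Per-frame CNN encoder (stand-in for a deeper backbone).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        self.rnn = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)  # one score per time step
        self.cls = nn.Linear(hidden_dim, num_classes)

    def forward(self, frames):
        # frames: (B, T, 3, H, W)
        B, T = frames.shape[:2]
        # Cheap motion saliency: absolute frame difference, no optical flow.
        diff = (frames[:, 1:] - frames[:, :-1]).abs()
        sal = torch.cat([torch.zeros_like(diff[:, :1]), diff], dim=1)
        # Emphasize moving regions before encoding each frame.
        x = frames * (1.0 + sal)
        feats = self.cnn(x.reshape(B * T, *x.shape[2:])).reshape(B, T, -1)
        h, _ = self.rnn(feats)                           # (B, T, hidden)
        w = F.softmax(self.attn(h).squeeze(-1), dim=1)   # temporal attention
        pooled = (w.unsqueeze(-1) * h).sum(dim=1)        # weighted sum over T
        return self.cls(pooled)

# Usage: a random batch of two 16-frame clips.
model = TemporalAttentionNet()
clip = torch.rand(2, 16, 3, 64, 64)
logits = model(clip)  # (2, 101): scores over UCF101's 101 action classes
```

Frame differencing stands in here for heavier motion cues such as optical flow, consistent with the abstract's claim of extracting motion information without computationally intensive preprocessing.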
CITATION STYLE
Son, J., Jang, G. J., & Lee, M. (2017). Temporal Attention Neural Network for Video Understanding. In Lecture Notes in Computer Science (Vol. 10635, pp. 422–430). Springer. https://doi.org/10.1007/978-3-319-70096-0_44