Three-stream action tubelet detector for spatiotemporal action detection in videos

Yutang Wu; Hanli Wang; Qinyu Li

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11165 LNCS, pp. 296-306 (2018)

DOI: 10.1007/978-3-030-00767-6_28

Citations: 1

Readers: 2

Abstract

In recent years, human action detection in videos has attracted wide attention. Rather than detecting actions frame by frame, the action tubelet (ACT) detector operates on sequences of frames and, in its two-stream form, achieves strong accuracy and speed. This work proposes a three-stream action tubelet detector (three-stream ACT detector) that extends the two-stream architecture with an additional pose stream, which supplies more information about human actions, and fuses the three streams by weighted averaging. Experimental results on the benchmark UCF-Sports, J-HMDB and UCF-101 datasets demonstrate that the proposed three-stream ACT detector improves human action detection performance.
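The weighted-average fusion mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives neither the weights nor the level at which scores are fused, so the function name `fuse_streams`, the example scores, and the weights below are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_streams(stream_scores: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Return the weighted average of per-class scores across streams.

    Hypothetical helper: the paper's actual weights and fusion level
    are not given in the abstract.
    """
    if len(stream_scores) != len(weights):
        raise ValueError("need exactly one weight per stream")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(stream_scores[0])
    if any(len(scores) != n_classes for scores in stream_scores):
        raise ValueError("all streams must score the same classes")
    # Normalise by the weight sum so the result is a true weighted average.
    return [
        sum(w * scores[c] for w, scores in zip(weights, stream_scores)) / total
        for c in range(n_classes)
    ]


# Made-up per-class scores for one candidate tubelet from each stream.
rgb_scores = [0.7, 0.2, 0.1]
flow_scores = [0.5, 0.4, 0.1]
pose_scores = [0.2, 0.7, 0.1]

# Assumed weights; here the pose stream counts half as much as the others.
fused = fuse_streams([rgb_scores, flow_scores, pose_scores],
                     weights=[1.0, 1.0, 0.5])
print(fused)
```

Setting all three weights equal reduces this to a plain mean of the streams, and setting the pose weight to zero leaves a two-stream average.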

Citation (APA)

Wu, Y., Wang, H., & Li, Q. (2018). Three-stream action tubelet detector for spatiotemporal action detection in videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11165 LNCS, pp. 296–306). Springer Verlag. https://doi.org/10.1007/978-3-030-00767-6_28
