Dynamic Temporal Pyramid Network: A Closer Look at Multi-scale Modeling for Activity Detection

Da Zhang; Xiyang Dai; Yuan Fang Wang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019), Vol. 11364 LNCS, pp. 712–728

DOI: 10.1007/978-3-030-20870-7_44

Abstract

Recognizing instances at varying scales simultaneously is a fundamental challenge in visual detection problems. While spatial multi-scale modeling has been well studied in object detection, how to effectively apply a multi-scale architecture to temporal models for activity detection is still under-explored. In this paper, we identify three unique challenges that need to be specifically handled for temporal activity detection. To address these issues, we propose the Dynamic Temporal Pyramid Network (DTPN), a new activity detection framework with a multi-scale pyramidal architecture featuring three novel designs: (1) We sample the frame sequence dynamically at different frame rates (frames per second, FPS) to construct a natural pyramidal representation for arbitrary-length input videos. (2) We design a two-branch multi-scale temporal feature hierarchy to deal with the inherent temporal scale variation of activity instances. (3) We further exploit the temporal context of activities by appropriately fusing multi-scale feature maps, and demonstrate that both local and global temporal contexts are important. By combining all these components into a unified network, we obtain a single-shot activity detector with single-pass inference and end-to-end training. Extensive experiments show that the proposed DTPN achieves state-of-the-art performance on the challenging ActivityNet dataset.
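
The abstract does not spell out how the dynamic sampling in design (1) is carried out, so the following is only an illustrative sketch of one way such a pyramid could be built, not the authors' procedure. It assumes that each pyramid level takes a fixed number of frames spread uniformly over the whole video, so that every level covers the same duration at a different effective frame rate. The function name `sample_pyramid_indices`, the parameters `base_length` and `num_levels`, and the doubling of length per level are all hypothetical choices made for this example.

```python
import numpy as np


def sample_pyramid_indices(num_frames, base_length=8, num_levels=3):
    """Return one array of frame indices per pyramid level.

    Assumption for illustration only: level k holds base_length * 2**k
    indices spaced uniformly over the full video, so the coarsest level
    has the lowest effective FPS and the finest level the highest.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    levels = []
    for k in range(num_levels):
        length = base_length * 2 ** k
        # Uniform positions from the first to the last frame, rounded to
        # valid integer indices. Short videos simply repeat some frames.
        idx = np.linspace(0, num_frames - 1, num=length).round().astype(int)
        levels.append(idx)
    return levels


if __name__ == "__main__":
    # A 300-frame video yields levels with 8, 16, and 32 sampled frames.
    for level, idx in enumerate(sample_pyramid_indices(300)):
        print(level, len(idx), idx[:4])
```

Under this assumption the output size of each level is independent of the video length, which is one plausible reading of how arbitrary-length inputs could be mapped to a fixed pyramidal representation.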

Cite

Zhang, D., Dai, X., & Wang, Y. F. (2019). Dynamic Temporal Pyramid Network: A Closer Look at Multi-scale Modeling for Activity Detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11364 LNCS, pp. 712–728). Springer Verlag. https://doi.org/10.1007/978-3-030-20870-7_44
