Spatio-temporal activity detection and recognition in untrimmed surveillance videos


Abstract

This work presents a spatio-temporal activity detection and recognition framework for untrimmed surveillance videos, built as a three-step pipeline: object detection, tracking, and activity recognition. Object detection relies on the YOLO v4 architecture and tracking on Euclidean distance, while the activity recognizer is a 3D convolutional deep learning architecture that employs spatio-temporal boundaries and treats recognition as a multi-label classification problem. Evaluation experiments on the VIRAT dataset show accurate detection of temporal boundaries and recognition of activities in untrimmed videos, with multi-label activity recognition performing better than multi-class activity recognition.
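
The abstract does not say how the Euclidean distance is used in the tracking step. One common reading, offered here only as a sketch, is to link each detected box in a frame to the nearest existing track by the distance between box centres. The function names, the greedy matching strategy, and the `max_dist` threshold below are illustrative assumptions, not details taken from the paper.

```python
import math


def centroid(box):
    """Centre (x, y) of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def update_tracks(tracks, detections, next_id, max_dist=50.0):
    """Greedy nearest-centroid association for one frame (assumed scheme).

    tracks:     dict mapping track id -> last known centroid (updated in place)
    detections: list of boxes (x1, y1, x2, y2) from the object detector
    next_id:    first unused track id
    max_dist:   hypothetical gating threshold in pixels

    Returns (ids, next_id), where ids[i] is the track id given to detections[i].
    Tracks with no matching detection are simply left unchanged.
    """
    # Every (track, detection) pair, closest pairs first.
    pairs = sorted(
        (math.dist(last, centroid(box)), track_id, i)
        for track_id, last in tracks.items()
        for i, box in enumerate(detections)
    )

    assigned = {}          # detection index -> track id
    used_tracks = set()
    for dist, track_id, i in pairs:
        if dist > max_dist:
            break          # remaining pairs are even farther apart
        if track_id in used_tracks or i in assigned:
            continue
        assigned[i] = track_id
        used_tracks.add(track_id)

    # Detections that matched nothing start new tracks.
    for i, box in enumerate(detections):
        if i not in assigned:
            assigned[i] = next_id
            next_id += 1
        tracks[assigned[i]] = centroid(box)

    return [assigned[i] for i in range(len(detections))], next_id
```

Called once per frame with the detector's boxes, this yields a per-object sequence of boxes, which is the kind of spatio-temporal tube a 3D convolutional recognizer could then classify. A production tracker would also need to retire stale tracks, which this sketch omits.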

Citation (APA)

Gkountakos, K., Touska, D., Ioannidis, K., Tsikrika, T., Vrochidis, S., & Kompatsiaris, I. (2021). Spatio-temporal activity detection and recognition in untrimmed surveillance videos. In ICMR 2021 - Proceedings of the 2021 International Conference on Multimedia Retrieval (pp. 451–455). Association for Computing Machinery, Inc. https://doi.org/10.1145/3460426.3463591
