Self-supervised Sparse Representation for Video Anomaly Detection

Abstract

Video anomaly detection (VAD) aims at localizing unexpected actions or activities in a video sequence. Existing mainstream VAD techniques are based on either the one-class formulation, which assumes all training data are normal, or the weakly-supervised formulation, which requires only video-level normal/anomaly labels. To establish a unified approach to solving the two VAD settings, we introduce a self-supervised sparse representation (S3R) framework that models the concept of anomaly at the feature level by exploring the synergy between dictionary-based representation and self-supervised learning. With the learned dictionary, S3R facilitates two coupled modules, en-Normal and de-Normal, to reconstruct snippet-level features and filter out normal-event features. The self-supervised techniques also enable generating pseudo normal/anomaly samples to train the anomaly detector. We demonstrate with extensive experiments that S3R achieves new state-of-the-art performance on popular benchmark datasets for both one-class and weakly-supervised VAD tasks. Our code is publicly available at https://github.com/louisYen/S3R.
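The core intuition behind dictionary-based representation can be illustrated with a minimal sketch: a dictionary learned from normal-event features should reconstruct normal snippets well but leave a large residual on anomalous ones, so the reconstruction error serves as an anomaly score. The function below is a hypothetical, simplified stand-in (ridge-regularized least squares in place of the sparse coding actually used in S3R) and does not reproduce the paper's en-Normal/de-Normal modules:

```python
import numpy as np

def reconstruction_anomaly_score(feature, dictionary, reg=0.1):
    """Score a snippet feature by how poorly a normal-event dictionary
    reconstructs it (larger residual = more anomalous).

    feature:    (d,) snippet-level feature vector
    dictionary: (d, k) matrix whose columns are atoms learned from normal data
    reg:        ridge regularizer, an illustrative stand-in for the
                sparsity constraint used in the actual S3R framework
    """
    D = dictionary
    # Ridge-regularized least squares: c = (D^T D + reg*I)^-1 D^T x
    coeffs = np.linalg.solve(D.T @ D + reg * np.eye(D.shape[1]), D.T @ feature)
    residual = feature - D @ coeffs  # part the dictionary cannot explain
    return float(np.linalg.norm(residual))

# Toy example: the dictionary spans only the first two axes of R^3,
# so a feature along the third axis cannot be reconstructed.
D = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
normal_score = reconstruction_anomaly_score(np.array([0.7, 0.3, 0.0]), D)
abnormal_score = reconstruction_anomaly_score(np.array([0.0, 0.0, 1.0]), D)
```

In this toy setup the "normal" feature lies near the dictionary's span and scores low, while the off-span feature scores close to 1, which is the behavior a dictionary-based detector relies on.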

Cite

APA

Wu, J. C., Hsieh, H. Y., Chen, D. J., Fuh, C. S., & Liu, T. L. (2022). Self-supervised Sparse Representation for Video Anomaly Detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13673 LNCS, pp. 729–745). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19778-9_42
