Abstract
Anomaly detection in videos commonly refers to discriminating events that do not conform to expected behavior. Most existing methods formulate video anomaly detection as an outlier detection task and establish the concept of normality by minimizing reconstruction loss or prediction loss on training data. However, the performance of these methods drops when they cannot guarantee either higher reconstruction errors for abnormal events or lower prediction errors for normal events. To avoid these problems, we introduce a novel contrastive representation learning task, Cluster Attention Contrast, which establishes subcategories of normality as clusters. Specifically, we employ multiple parallel projection layers to project snippet-level video features into multiple discriminative feature spaces. Each feature space corresponds to a cluster that captures a distinct subcategory of normality. To acquire reliable subcategories, we propose the Cluster Attention Module to derive the cluster attention representation of each snippet, and then maximize the agreement between representations of the same snippet under random data augmentations via momentum contrast. In this manner, we establish a robust concept of normality without any prior assumptions on reconstruction errors or prediction errors. Experiments show that our approach achieves state-of-the-art performance on benchmark datasets.
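The pipeline described in the abstract (parallel projection heads, an attention step over the resulting clusters, and a momentum-contrast objective) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not specify the form of the attention, so the sketch assumes a softmax over dot products with hypothetical per-cluster `centers`; the names `cluster_attention`, `info_nce`, and `momentum_update`, the temperature of 0.07, the momentum of 0.999, and all dimensions are likewise assumptions chosen for illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    return [dot(row, x) for row in W]

def l2_normalize(v):
    n = math.sqrt(dot(v, v)) or 1.0  # leave an all-zero vector unchanged
    return [a / n for a in v]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cluster_attention(x, heads, centers):
    """Project x with K parallel heads, then attention-pool the K projections.

    Assumption: attention weights are a softmax over the dot product between
    each projection and a hypothetical center for that cluster.
    """
    projected = [l2_normalize(matvec(W, x)) for W in heads]
    weights = softmax([dot(p, c) for p, c in zip(projected, centers)])
    dim = len(projected[0])
    pooled = [sum(w * p[i] for w, p in zip(weights, projected)) for i in range(dim)]
    return l2_normalize(pooled), weights

def info_nce(query, positive, negatives, temperature=0.07):
    """Contrastive loss: the positive key should score higher than all negatives."""
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, n) / temperature for n in negatives]
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]

def momentum_update(key_heads, query_heads, m=0.999):
    """Key encoder slowly tracks the query encoder (momentum-contrast style)."""
    return [[[m * k + (1.0 - m) * q for k, q in zip(k_row, q_row)]
             for k_row, q_row in zip(k_W, q_W)]
            for k_W, q_W in zip(key_heads, query_heads)]

# Toy usage: K = 3 clusters, 8-dim snippet features, 4-dim projections.
random.seed(0)
K, IN_DIM, OUT_DIM = 3, 8, 4
query_heads = [[[random.gauss(0, 1) for _ in range(IN_DIM)] for _ in range(OUT_DIM)]
               for _ in range(K)]
key_heads = [[row[:] for row in W] for W in query_heads]  # key encoder starts as a copy
centers = [l2_normalize([random.gauss(0, 1) for _ in range(OUT_DIM)]) for _ in range(K)]

snippet = [random.gauss(0, 1) for _ in range(IN_DIM)]
view_a = [v + random.gauss(0, 0.05) for v in snippet]  # stand-in for augmentation 1
view_b = [v + random.gauss(0, 0.05) for v in snippet]  # stand-in for augmentation 2

query, attn = cluster_attention(view_a, query_heads, centers)
key, _ = cluster_attention(view_b, key_heads, centers)
# Negatives: representations of other (random) snippets, standing in for a key queue.
negatives = [cluster_attention([random.gauss(0, 1) for _ in range(IN_DIM)],
                               key_heads, centers)[0] for _ in range(5)]

loss = info_nce(query, key, negatives)
key_heads = momentum_update(key_heads, query_heads)
print("attention weights:", attn)
print("contrastive loss:", loss)
```

In a real system the heads would be trained by gradient descent on this loss, with additive noise replaced by actual video augmentations; the sketch only shows how the pieces fit together for a single snippet.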
Wang, Z., Zou, Y., & Zhang, Z. (2020). Cluster Attention Contrast for Video Anomaly Detection. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 2463–2471). Association for Computing Machinery, Inc. https://doi.org/10.1145/3394171.3413529