Inflated 3D ConvNet context analysis for violence detection

Abstract

According to the Wall Street Journal, one billion surveillance cameras will be deployed around the world by 2021. Humans can hardly manage this amount of information. Using an Inflated 3D ConvNet as its backbone, this paper introduces a novel automatic violence detection approach that outperforms existing state-of-the-art proposals. Most of those proposals include a pre-processing step that restricts attention to regions of interest in the scene, i.e., those actually containing a human subject. In this regard, this paper also reports an extensive analysis of whether and how context affects the performance of the adopted classifier. The experiments show that context-free footage substantially degrades classifier performance (by 2% to 5%) on publicly available datasets. However, they also demonstrate that performance stabilizes in context-free settings, regardless of the level of context restriction applied. Finally, a cross-dataset experiment investigates whether results obtained in a single-collection experiment (same dataset used for training and testing) generalize to cross-collection settings (different datasets used for training and testing).
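
The distinction between single-collection and cross-collection settings can be illustrated with a minimal sketch. It only enumerates train/test pairings and labels each one. The function name `evaluation_pairs` and the dataset names are placeholders assumed for illustration, since the abstract does not name the datasets, and the sketch does not reproduce the authors' actual experimental code.

```python
from itertools import product


def evaluation_pairs(dataset_names):
    """Enumerate (train, test, setting) triples over the given dataset names.

    A pairing is "single-collection" when the same dataset is used for
    training and testing, and "cross-collection" otherwise.
    """
    pairs = []
    for train, test in product(dataset_names, repeat=2):
        setting = "single-collection" if train == test else "cross-collection"
        pairs.append((train, test, setting))
    return pairs


# Placeholder dataset names (assumption): the abstract does not list them.
for train, test, setting in evaluation_pairs(["dataset_A", "dataset_B"]):
    print(f"train={train} test={test} -> {setting}")
```

In a single-collection pairing, training and testing would presumably still use disjoint splits of that dataset; the abstract does not state the split scheme, so it is left out of the sketch.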

Cite

Freire-Obregón, D., Barra, P., Castrillón-Santana, M., & Marsico, M. D. (2022). Inflated 3D ConvNet context analysis for violence detection. In Machine Vision and Applications (Vol. 33). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/s00138-021-01264-9
