Typical manually selected features are insufficient to reliably detect violent actions. In this paper, we present a violence detection model based on a bi-channel convolutional neural network (CNN) and support vector machines (SVMs). The major contributions are twofold: (1) we feed the original frames and the differential images into the proposed bi-channel CNN to obtain appearance features and motion features, respectively; (2) we adopt linear SVMs to classify the features and propose a label fusion approach that improves detection performance by integrating the appearance and motion information. We compared the proposed model with several state-of-the-art methods on two datasets. The results are promising, and the proposed method achieves real-time performance at 30 fps.
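The two non-CNN steps mentioned in the abstract, building differential images and fusing the labels of the two channels, can be illustrated with a minimal sketch. The abstract does not specify how the differential images are computed or how the fusion is done, so everything here is an assumption for illustration: `differential_images` uses the absolute difference of consecutive frames, and `fuse_labels` (a hypothetical helper) takes a weighted sum of the signed decision scores that two linear SVMs would produce, with a hypothetical `weight` parameter. It is not the authors' implementation.

```python
import numpy as np


def differential_images(frames):
    """Return absolute differences between consecutive frames.

    Assumption: a "differential image" is |frame[t+1] - frame[t]|.
    Input shape (T, H, W); output shape (T-1, H, W).
    """
    frames = np.asarray(frames, dtype=np.float32)
    return np.abs(frames[1:] - frames[:-1])


def fuse_labels(appearance_scores, motion_scores, weight=0.5):
    """Fuse two per-sample SVM decision scores into binary labels.

    Assumption: fusion is a weighted sum of signed scores, and a
    positive fused score means "violent" (label 1), otherwise 0.
    """
    a = np.asarray(appearance_scores, dtype=np.float64)
    m = np.asarray(motion_scores, dtype=np.float64)
    fused = weight * a + (1.0 - weight) * m
    return (fused > 0).astype(int)


# Tiny example: three 2x2 frames, where only the middle one is bright.
frames = np.zeros((3, 2, 2))
frames[1] = 1.0
diffs = differential_images(frames)

# Hypothetical decision scores from the appearance and motion SVMs.
labels = fuse_labels([1.0, -2.0, 0.2], [-0.5, 1.0, -0.4])
```

In a full pipeline, the scores passed to `fuse_labels` would come from linear SVMs trained on the appearance and motion features extracted by the two CNN channels. A weighted sum is only one possible fusion rule.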
CITATION STYLE
Xia, Q., Zhang, P., Wang, J. J., Tian, M., & Fei, C. (2018). Real time violence detection based on deep spatio-temporal features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10996 LNCS, pp. 157–165). Springer Verlag. https://doi.org/10.1007/978-3-319-97909-0_17