Violent scene detection using convolutional neural networks and deep audio features

Guankun Mu; Haibing Cao; Qin Jin

Conference Proceedings

Communications in Computer and Information Science (2016) 663 451-461

DOI: 10.1007/978-981-10-3005-5_37

34 Citations

24 Readers

Abstract

Violent scene detection (VSD) in videos has practical significance in various applications, such as film rating and child protection against violent behavior. Most previous VSD systems have relied mainly on visual cues, although acoustic cues can also help to detect violent scenes, especially when visual cues are unreliable. In this paper, we focus on exploring acoustic information for violent scene detection. Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in visual content processing tasks. We therefore investigate using CNNs for violent scene detection based on the acoustic information in videos. We apply CNNs in two ways: directly as a classifier, or as a deep acoustic feature extractor. Experimental results on the MediaEval 2015 evaluation dataset show that CNNs are effective both as classifiers and as acoustic feature extractors. Furthermore, fusing acoustic and visual information significantly improves violent scene detection performance.
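
The two ways of applying a CNN named in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the class `TinyAudioCNN`, the layer sizes, the random (untrained) weights, and the spectrogram-like input are all assumptions made for illustration. The sketch only shows the difference between reading class probabilities from the network and reading its intermediate activations as a feature vector for a separate downstream classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(x, kernels):
    """Naive 'valid' cross-correlation. x: (H, W), kernels: (K, kh, kw)."""
    n_k, kh, kw = kernels.shape
    h, w = x.shape
    out = np.empty((n_k, h - kh + 1, w - kw + 1))
    for k in range(n_k):
        for i in range(h - kh + 1):
            for j in range(w - kw + 1):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[k])
    return out

class TinyAudioCNN:
    """Hypothetical one-layer CNN with random, untrained weights."""

    def __init__(self, n_kernels=4, n_classes=2):
        self.kernels = rng.normal(size=(n_kernels, 3, 3))
        self.w_out = rng.normal(size=(n_kernels, n_classes))

    def features(self, spec):
        # Convolution + ReLU, then global average pooling per kernel.
        fmap = np.maximum(conv2d_valid(spec, self.kernels), 0.0)
        return fmap.mean(axis=(1, 2))  # shape: (n_kernels,)

    def predict_proba(self, spec):
        # Linear output layer followed by a numerically stable softmax.
        logits = self.features(spec) @ self.w_out
        e = np.exp(logits - logits.max())
        return e / e.sum()

# Placeholder input standing in for a time-frequency audio representation
# (assumed here; the abstract does not specify the input format).
spec = rng.normal(size=(20, 30))
model = TinyAudioCNN()

probs = model.predict_proba(spec)  # use 1: CNN as a direct classifier
feats = model.features(spec)       # use 2: CNN as a feature extractor
```

In the second use, `feats` would be passed to a separate classifier trained on the extracted vectors; which classifier the paper uses is not stated in the abstract.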

Cite

Mu, G., Cao, H., & Jin, Q. (2016). Violent scene detection using convolutional neural networks and deep audio features. In Communications in Computer and Information Science (Vol. 663, pp. 451–461). Springer Verlag. https://doi.org/10.1007/978-981-10-3005-5_37
