Analysis of Machine Learning Algorithms for Violence Detection in Audio


Abstract

Violence has always been part of humanity. There are different types of violence, with physical violence being the most recurrent in our daily lives. This type of violence increasingly affects many people’s lives, so it is essential to try to combat it. In recent years, human action recognition has been extensively studied, but mainly in video, an important area of computer vision. Audio can circumvent several limitations of video. Audio sensors can be omnidirectional, and audio requires less processing power and lower hardware and software performance than video. Audio can also convey emotions. It is not affected by lighting or temperature problems, nor does the sensor need to be at a favourable angle to capture the intended information. For these reasons, audio is seen as the best way to recognize violence when combined with Machine Learning, Deep Learning and Transfer Learning techniques. In this paper we test a Convolutional Neural Network (CNN), ResNet50, VGG16 and VGG19 for classifying audio clips. The CNN obtains the best results, with 92.44% accuracy on the test set. ResNet50 was the worst of the models tested, with 86.34% accuracy. Both VGG models show good potential but did not outperform the CNN.
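The abstract does not say how audio clips are presented to image-oriented networks such as VGG16, VGG19 or ResNet50. One common approach, assumed here purely for illustration and not taken from the paper, is to convert each waveform into a log-magnitude spectrogram and repeat it across three channels so that it resembles an RGB image. The following is a minimal NumPy sketch under that assumption; the function names, frame length, hop size and synthetic test tone are all hypothetical choices.

```python
import numpy as np


def log_spectrogram(wave, frame_len=512, hop=256):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames).

    Assumption: a plain short-time Fourier transform with a Hann window;
    the paper's actual feature extraction is not described in the abstract.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(wave) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        wave[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # log1p compresses the dynamic range and is safe at zero magnitude.
    return np.log1p(magnitude).T


def to_three_channels(spec):
    """Scale a spectrogram to [0, 1] and repeat it to shape (H, W, 3)."""
    lo, hi = spec.min(), spec.max()
    scaled = (spec - lo) / (hi - lo + 1e-9)
    return np.repeat(scaled[:, :, None], 3, axis=2)


# Hypothetical input: a one-second 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
wave = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(wave)
image = to_three_channels(spec)
print(spec.shape, image.shape)
```

An array like `image` would still need resizing to whatever input size a given pretrained network expects before it could be used for transfer learning.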

Cite

Veloso, B., Durães, D., & Novais, P. (2022). Analysis of Machine Learning Algorithms for Violence Detection in Audio. In Communications in Computer and Information Science (Vol. 1678 CCIS, pp. 210–221). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-18697-4_17
