Multi-Modal Anomaly Detection by Using Audio and Visual Cues

This article is free to access.

Abstract

This paper considers the problem of anomaly detection in outdoor environments where surveillance cameras are typically installed to monitor the activities of the general public. A novel solution is proposed that combines audio and visual data to automatically detect abnormal activities in a scene. The visual features are obtained using optical flow combined with particle swarm optimization and the social force model, whereas the acoustic features comprise energy, zero crossing rate, volume, spectral centroid, spectral spread, spectral roll-off, spectral flux, cross-correlation, and the mel-frequency cepstral coefficients (MFCCs). An anomaly inference scheme is developed that is based on both the visual and the audio features. The performance of the proposed algorithm is evaluated on the publicly available UMN datasets combined with audio recordings. The proposed algorithm is compared with state-of-the-art techniques and is shown to achieve improved accuracy.
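To make a few of the acoustic features named in the abstract concrete, the following is a minimal sketch of how short-time energy, zero crossing rate, spectral centroid, spectral spread, spectral roll-off, and spectral flux are commonly computed for a single audio frame. It uses textbook definitions, not the paper's implementation. The function names `frame_features` and `spectral_flux`, the 0.85 roll-off fraction, and the frame length in the usage lines are assumptions chosen for illustration.

```python
import numpy as np


def frame_features(frame, sample_rate, rolloff_fraction=0.85):
    """Textbook short-time features for one audio frame (illustrative only)."""
    frame = np.asarray(frame, dtype=float)

    # Short-time energy: mean squared amplitude of the frame.
    energy = float(np.mean(frame ** 2))

    # Zero crossing rate: fraction of adjacent sample pairs whose sign differs.
    signs = np.signbit(frame)
    zcr = float(np.mean(signs[1:] != signs[:-1]))

    # Magnitude spectrum and the frequency (in Hz) of each bin.
    magnitude = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(frame.size, d=1.0 / sample_rate)

    total = magnitude.sum()
    if total == 0.0:
        # Silent frame: the spectral statistics are undefined, report zeros.
        centroid = spread = rolloff = 0.0
    else:
        weights = magnitude / total
        # Spectral centroid: magnitude-weighted mean frequency.
        centroid = float(np.sum(freqs * weights))
        # Spectral spread: magnitude-weighted standard deviation around the centroid.
        spread = float(np.sqrt(np.sum(((freqs - centroid) ** 2) * weights)))
        # Spectral roll-off: lowest frequency below which the chosen
        # fraction of the total spectral magnitude is contained.
        cumulative = np.cumsum(magnitude)
        idx = int(np.searchsorted(cumulative, rolloff_fraction * total))
        rolloff = float(freqs[min(idx, freqs.size - 1)])

    return {
        "energy": energy,
        "zcr": zcr,
        "centroid": centroid,
        "spread": spread,
        "rolloff": rolloff,
    }


def spectral_flux(prev_frame, frame):
    """Squared difference between the magnitude spectra of two equal-length frames."""
    prev_mag = np.abs(np.fft.rfft(np.asarray(prev_frame, dtype=float)))
    mag = np.abs(np.fft.rfft(np.asarray(frame, dtype=float)))
    return float(np.sum((mag - prev_mag) ** 2))


# Usage on a synthetic 1 kHz tone (hypothetical input, not data from the paper).
sr = 8000
t = np.arange(800) / sr
tone = np.sin(2 * np.pi * 1000 * t + 0.3)
print(frame_features(tone, sr))
```

In a pipeline of the kind the abstract describes, per-frame values like these would be stacked into a feature vector alongside the MFCCs before being passed to the anomaly inference stage. How the paper frames, normalizes, or weights them is not stated in the abstract.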

Cite

Rehman, A. U., Ullah, H. S., Farooq, H., Khan, M. S., Mahmood, T., & Khan, H. O. A. (2021). Multi-Modal Anomaly Detection by Using Audio and Visual Cues. IEEE Access, 9, 30587–30603. https://doi.org/10.1109/ACCESS.2021.3059519
