An Adaptive Framework for Anomaly Detection in Time-Series Audio-Visual Data

10 citations · 34 Mendeley readers

This article is free to access.

Abstract

Anomaly detection is an integral part of many surveillance applications. However, most existing anomaly detection models are trained statically on pre-recorded data from a single source and therefore make multiple assumptions about the surrounding environment, which limits their usefulness to controlled scenarios. In this paper, we fuse information from live audio and video streams to detect anomalies in the captured environment. We train a deep-learning-based teacher-student network on video, image, and audio information: the pre-trained visual network in the teacher model distills its knowledge to the image and audio networks in the student model. Features from the image and audio networks are concatenated and compressed with principal component analysis (PCA), so the teacher-student network produces a lightweight joint image-audio representation of the data. The data dynamics are learned by a multivariate adaptive Gaussian mixture model. Empirical results on two audio-visual datasets demonstrate the effectiveness of the joint representation over single modalities in the adaptive anomaly detection framework. The proposed framework outperforms state-of-the-art methods by an average of 15.00% and 14.52% in AUC on dataset 1 and dataset 2, respectively.
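The abstract outlines a concrete scoring pipeline: concatenate image and audio features, compress them with PCA, and evaluate the compressed representation under a multivariate Gaussian mixture model. The sketch below illustrates that stage only; it is not the authors' released code, and the feature dimensions, PCA component count, and mixture size are illustrative assumptions.

```python
# Minimal sketch of the fusion-plus-GMM scoring stage described in the
# abstract. Feature extractors are replaced by random stand-ins; all
# dimensions and hyperparameters below are assumptions, not the paper's.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-ins for features produced by the image and audio student networks.
image_feats = rng.normal(size=(500, 256))   # hypothetical 256-d image features
audio_feats = rng.normal(size=(500, 128))   # hypothetical 128-d audio features

# Joint image-audio representation, compressed with PCA as in the abstract.
joint = np.hstack([image_feats, audio_feats])
pca = PCA(n_components=32).fit(joint)       # 32 components is an assumption
compressed = pca.transform(joint)

# Multivariate Gaussian mixture model over the compressed representation.
gmm = GaussianMixture(n_components=5, covariance_type="full",
                      random_state=0).fit(compressed)

def anomaly_score(image_feat, audio_feat):
    """Negative log-likelihood under the GMM; higher means more anomalous."""
    x = pca.transform(np.hstack([image_feat, audio_feat]).reshape(1, -1))
    return -gmm.score_samples(x)[0]

# One plausible reading of "adaptive": periodically refit the GMM on a
# sliding window of recent compressed features so it tracks the live stream.
```

In a live deployment, refitting the mixture on a sliding window of recent features is one way the model could adapt to a changing environment, consistent with the adaptive framing in the abstract.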

Cite (APA)

Kumari, P., & Saini, M. (2022). An Adaptive Framework for Anomaly Detection in Time-Series Audio-Visual Data. IEEE Access, 10, 36188–36199. https://doi.org/10.1109/ACCESS.2022.3164439
