Three-Stage Deep Learning Framework for Video Surveillance

10Citations
Citations of this article
18Readers
Mendeley users who have this article in their library.

Abstract

The escalating use of security cameras has resulted in a surge in images requiring analysis, a task hindered by the inefficiency and error-prone nature of manual monitoring. In response, this study delves into the domain of anomaly detection in CCTV security footage, addressing challenges previously encountered in analyzing videos with complex or dynamic backgrounds and long sequences. We introduce a three-stage deep learning architecture designed to detect abnormalities in security camera videos. The first stage employs a pre-trained convolutional neural network to extract features from individual video frames. Subsequently, these features are transformed into time series data in the second stage, utilizing a blend of bidirectional long short-term memory and multi-head attention to analyze short-term frame relationships. The final stage leverages relative positional embeddings and a custom Transformer encoder to interpret long-range frame relationships and identify anomalies. Tested on various open datasets, particularly those with complex backgrounds and extended sequences, our method demonstrates enhanced accuracy and efficiency in video analysis. This approach not only improves current security camera analysis but also shows potential for diverse application settings, signifying a significant advancement in the evolution of security camera monitoring and analysis technologies.

Cite

CITATION STYLE

APA

Lee, J. W., & Kang, H. S. (2024). Three-Stage Deep Learning Framework for Video Surveillance. Applied Sciences (Switzerland), 14(1). https://doi.org/10.3390/app14010408

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free