Surveillance video summarisation extracts the video segments that contain abnormal events from surveillance footage, so accurate identification of abnormal events is of paramount importance. Accordingly, the proposed framework builds an aggregated convolutional recurrent model that detects suspicious events in surveillance footage using supervised learning, which is found to yield better results than its unsupervised counterparts. The first stage of the model is a multilayer Convolutional Neural Network (CNN) for frame-level feature extraction, followed by a stacked bidirectional Gated Recurrent Unit (GRU) for sequence-level feature extraction and classification. Since the video clips used for training are not specific to surveillance, a block-based approach for testing on surveillance videos is proposed. Results evaluated on two custom datasets, Streets and Campus, show that the proposed model performs well by leveraging the properties of bidirectional GRUs with supervised learning. Extensive experimental analysis on the selection of the optimum architecture substantiates the advantage of stacked bidirectional GRUs over unidirectional ones. Additionally, qualitative results indicate that the summaries produced are concise, representative, complete, diverse and informative. Moreover, a comparison with state-of-the-art methods shows that the proposed model outperforms them.
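The block-based testing idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the block length, the decision threshold, or the classifier interface, so `block_size`, `threshold`, and the `classify_block` callable (a stand-in for the trained CNN and bidirectional GRU model) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_blocks(num_frames: int, block_size: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges that cover the footage."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        (start, min(start + block_size, num_frames))
        for start in range(0, num_frames, block_size)
    ]


def summarise(
    frames: Sequence,
    classify_block: Callable[[Sequence], float],
    block_size: int = 16,   # assumed value, not taken from the paper
    threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[Tuple[int, int]]:
    """Keep the blocks scored as abnormal and merge adjacent kept blocks.

    classify_block is a hypothetical stand-in for the trained model: it
    receives one block of frames and returns an abnormality score in [0, 1].
    """
    segments: List[Tuple[int, int]] = []
    for start, end in split_into_blocks(len(frames), block_size):
        score = classify_block(frames[start:end])
        if score < threshold:
            continue
        if segments and segments[-1][1] == start:
            # Extend the previous segment when this block follows it directly.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments
```

Calling `summarise(frames, model_score)` with a long frame sequence returns the frame ranges to retain in the summary. Merging adjacent blocks keeps an event that spans several blocks as a single segment.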
CITATION STYLE
Sreeja, M. U., & Kovoor, B. C. (2021). An aggregated deep convolutional recurrent model for event based surveillance video summarisation: A supervised approach. IET Computer Vision, 15(4), 297–311. https://doi.org/10.1049/cvi2.12044