Compact video description and representation for automated summarization of human activities

Abstract

A compact framework is presented for the description and representation of videos depicting human activities, with the goal of enabling automated large-volume video summarization for semantically meaningful key-frame extraction. The framework is structured around the concept of per-frame visual word histograms, using the popular Bag-of-Features approach. Three existing image descriptors (histogram, FMoD, SURF) and a novel one (LMoD), as well as a component of an existing state-of-the-art activity descriptor (Dense Trajectories), are adapted into the proposed framework and quantitatively compared against each other, as well as against the most common video summarization descriptor (global image histogram), using a publicly available annotated dataset and the most prevalent video summarization method, i.e., frame clustering. In all cases, several image modalities are exploited (luminance, hue, edges, optical flow magnitude) in order to simultaneously capture information about the depicted shapes, colors, lighting, textures and motions. The quantitative evaluation results indicate that one of the proposed descriptors clearly outperforms the competing approaches in the context of the presented framework.

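To make the described pipeline concrete, the sketch below shows one way the per-frame Bag-of-Features representation and the frame-clustering summarization step could be put together. It is a minimal illustration, not the paper's implementation: ORB is used as a stand-in local descriptor (the paper evaluates histogram, FMoD, SURF, LMoD and a Dense Trajectories component), only the luminance modality is used, and the sampling step, vocabulary size and number of key-frames are illustrative assumptions.

# Minimal Bag-of-Features + frame-clustering key-frame sketch (assumptions noted above).
import cv2
import numpy as np
from sklearn.cluster import KMeans

def frame_descriptors(video_path, step=10):
    """Extract local descriptors from every `step`-th frame (luminance only)."""
    cap = cv2.VideoCapture(video_path)
    orb = cv2.ORB_create()  # stand-in for the descriptors evaluated in the paper
    per_frame, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            _, desc = orb.detectAndCompute(gray, None)
            per_frame.append(desc if desc is not None else np.empty((0, 32), np.uint8))
        idx += 1
    cap.release()
    return per_frame

def bof_histograms(per_frame, vocab_size=64):
    """Learn a visual-word codebook and build one normalized histogram per frame."""
    all_desc = np.vstack([d for d in per_frame if len(d)]).astype(np.float32)
    codebook = KMeans(n_clusters=vocab_size, n_init=4, random_state=0).fit(all_desc)
    hists = []
    for d in per_frame:
        h = np.zeros(vocab_size, np.float32)
        if len(d):
            words = codebook.predict(d.astype(np.float32))
            h = np.bincount(words, minlength=vocab_size).astype(np.float32)
            h /= h.sum()
        hists.append(h)
    return np.array(hists)

def key_frame_indices(hists, n_key_frames=5):
    """Cluster per-frame histograms; pick the frame closest to each cluster centroid."""
    km = KMeans(n_clusters=n_key_frames, n_init=4, random_state=0).fit(hists)
    keys = []
    for c in range(n_key_frames):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(hists[members] - km.cluster_centers_[c], axis=1)
        keys.append(int(members[np.argmin(dists)]))
    return sorted(keys)

The returned indices refer to the sampled frames, so multiplying by the sampling step maps them back to positions in the original video; extending the sketch to the other modalities mentioned in the abstract (hue, edges, optical flow magnitude) would amount to concatenating the corresponding per-modality histograms before clustering.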

Citation (APA)

Mademlis, I., Tefas, A., Nikolaidis, N., & Pitas, I. (2017). Compact video description and representation for automated summarization of human activities. In Advances in Intelligent Systems and Computing (Vol. 529, pp. 18–28). Springer Verlag. https://doi.org/10.1007/978-3-319-47898-2_3
