Lifelog scene change detection using cascades of audio and video detectors

Abstract

The advent of affordable wearable devices with a video camera has established a new form of social data, lifelogs, in which people's lives are captured on video. The enormous amount of lifelog data and the need for on-site processing demand new, fast video processing methods. In this work, we experimentally investigate seven hours of lifelogs and report two novel findings: (1) audio cues are exceptionally strong for lifelog processing; (2) cascades of audio and video detectors improve accuracy and enable fast (super-frame-rate) processing. We first construct strong detectors using state-of-the-art audio and visual features: Mel-frequency cepstral coefficients (MFCC), colour (RGB) histograms, and local patch descriptors (SIFT). In the second stage, we construct a cascade of the trained detectors and optimise the cascade parameters. Separating the detector and cascade optimisation stages simplifies training and results in a fast and accurate processing pipeline.
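The cascade idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cascade_detect`, the per-stage reject/accept thresholds, the final threshold, and the toy scores are all hypothetical, chosen only to show how a cheap detector (for example, an audio one) can settle confident cases so that a costlier detector (for example, a video one) runs only on the uncertain segments.

```python
from typing import Any, Callable, List, Sequence, Tuple

# A stage is (detector, reject_below, accept_above). All values are hypothetical.
Stage = Tuple[Callable[[Any], float], float, float]


def cascade_detect(segments: Sequence[Any], stages: Sequence[Stage],
                   final_threshold: float = 0.5) -> List[bool]:
    """Return one scene-change decision per segment.

    Stages run in order (cheapest first). A stage decides early when its
    score is confidently low or high; otherwise the next stage is consulted.
    """
    decisions = []
    for seg in segments:
        decision = None
        score = 0.0
        for detector, reject_below, accept_above in stages:
            score = detector(seg)
            if score < reject_below:      # confident "no change"
                decision = False
                break
            if score >= accept_above:     # confident "change"
                decision = True
                break
        if decision is None:
            # No stage was confident: fall back to the last stage's score.
            decision = score >= final_threshold
        decisions.append(decision)
    return decisions


# Toy data: precomputed scores stand in for trained detectors (assumption).
calls = {"video": 0}


def audio_score(seg):
    return seg["audio"]


def video_score(seg):
    calls["video"] += 1  # count how often the costly stage actually runs
    return seg["video"]


segments = [
    {"audio": 0.05, "video": 0.9},  # rejected by audio alone
    {"audio": 0.95, "video": 0.1},  # accepted by audio alone
    {"audio": 0.50, "video": 0.8},  # audio unsure, video accepts
    {"audio": 0.50, "video": 0.4},  # both unsure, final threshold decides
]
stages = [(audio_score, 0.2, 0.8), (video_score, 0.3, 0.7)]
decisions = cascade_detect(segments, stages)
print(decisions, calls["video"])
```

In this toy run the video stage is evaluated for only two of the four segments, which is the source of the speed-up a cascade aims for. How the thresholds are actually optimised is not described in the abstract and is left out here.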

Citation (APA)

Mahkonen, K., Kämäräinen, J. K., & Virtanen, T. (2015). Lifelog scene change detection using cascades of audio and video detectors. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9010, pp. 434–444). Springer Verlag. https://doi.org/10.1007/978-3-319-16634-6_32
