Lifelog scene change detection using cascades of audio and video detectors

Katariina Mahkonen; Joni Kristian Kämäräinen; Tuomas Virtanen

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9010 434-444

DOI: 10.1007/978-3-319-16634-6_32

Citations: 1

Readers: 10

Abstract

The advent of affordable wearable devices with a video camera has established a new form of social data, lifelogs, in which people's lives are captured on video. The enormous amount of lifelog data and the need for on-site processing demand new, fast video processing methods. In this work, we experimentally investigate seven hours of lifelogs and point out novel findings: (1) audio cues are exceptionally strong for lifelog processing; (2) cascades of audio and video detectors improve accuracy and enable fast (super frame rate) processing speed. We first construct strong detectors using state-of-the-art audio and visual features: Mel-frequency cepstral coefficients (MFCC), colour (RGB) histograms, and local patch descriptors (SIFT). In the second stage, we construct a cascade of the trained detectors and optimise the cascade parameters. Separating the detector and cascade optimisation stages simplifies training and results in a fast and accurate processing pipeline.
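
The abstract does not spell out the cascade structure, so the following is only a minimal sketch of a generic two-stage early-exit cascade, not the authors' implementation. It assumes a cheap first-stage score (for example, from an audio detector) and a costlier second-stage score (for example, from a video detector) that is computed only when the first stage is undecided. The function name `cascade_detect`, the three thresholds, and the toy scores are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Segment = TypeVar("Segment")


def cascade_detect(
    segments: Sequence[Segment],
    cheap_score: Callable[[Segment], float],
    costly_score: Callable[[Segment], float],
    reject_below: float,
    accept_above: float,
    final_threshold: float,
) -> Tuple[List[bool], int]:
    """Generic two-stage cascade (illustrative assumption, not the paper's method).

    The cheap detector decides confident cases on its own; the costly
    detector runs only on segments whose cheap score falls between
    reject_below and accept_above (inclusive).
    Returns the per-segment decisions and the number of costly evaluations.
    """
    decisions: List[bool] = []
    costly_calls = 0
    for seg in segments:
        s = cheap_score(seg)
        if s < reject_below:
            decisions.append(False)  # confident "no scene change"
        elif s > accept_above:
            decisions.append(True)  # confident "scene change"
        else:
            costly_calls += 1  # undecided: fall through to stage two
            decisions.append(costly_score(seg) >= final_threshold)
    return decisions, costly_calls


# Toy data: precomputed, made-up detector scores per segment.
toy_segments = [
    {"audio": 0.10, "video": 0.90},
    {"audio": 0.95, "video": 0.20},
    {"audio": 0.50, "video": 0.80},
    {"audio": 0.50, "video": 0.30},
]

toy_decisions, toy_costly_calls = cascade_detect(
    toy_segments,
    cheap_score=lambda seg: seg["audio"],
    costly_score=lambda seg: seg["video"],
    reject_below=0.3,
    accept_above=0.7,
    final_threshold=0.5,
)
```

In this toy run only the two undecided segments reach the second stage, which is where a cascade saves time. In the setting the abstract describes, the thresholds would be the cascade parameters tuned after the individual detectors have been trained.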

Citation (APA)

Mahkonen, K., Kämäräinen, J. K., & Virtanen, T. (2015). Lifelog scene change detection using cascades of audio and video detectors. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9010, pp. 434–444). Springer Verlag. https://doi.org/10.1007/978-3-319-16634-6_32
