A content-adaptive analysis and representation framework for audio event discovery from "unscripted" multimedia

Abstract

We propose a content-adaptive analysis and representation framework to discover events using audio features from "unscripted" multimedia such as sports and surveillance for summarization. The proposed analysis framework performs an inlier/outlier-based temporal segmentation of the content. It is motivated by the observation that "interesting" events in unscripted multimedia occur sparsely in a background of usual or "uninteresting" events. We treat the sequence of low/mid-level features extracted from the audio as a time series and identify subsequences that are outliers. The outlier detection is based on eigenvector analysis of the affinity matrix constructed from statistical models estimated from the subsequences of the time series. We define the confidence measure on each of the detected outliers as the probability that it is an outlier. Then, we establish a relationship between the parameters of the proposed framework and the confidence measure. Furthermore, we use the confidence measure to rank the detected outliers in terms of their departures from the background process. Our experimental results with sequences of low- and mid-level audio features extracted from sports video show that "highlight" events can be extracted effectively as outliers from a background process using the proposed framework. We proceed to show the effectiveness of the proposed framework in bringing out suspicious events from surveillance videos without any a priori knowledge. We show that such temporal segmentation into background and outliers, along with the ranking based on the departure from the background, can be used to generate content summaries of any desired length. Finally, we also show that the proposed framework can be used to systematically select "key audio classes" that are indicative of events of interest in the chosen domain.
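
The outlier-detection idea described in the abstract (per-subsequence statistical models, an affinity matrix, and eigenvector analysis) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: each window is modeled by a single diagonal Gaussian, windows are compared with a symmetric KL divergence, the kernel scale `sigma` defaults to the mean pairwise distance, the split uses the sign of the second eigenvector of the normalized affinity matrix, and the ranking score is a simple distance-to-background stand-in rather than the probabilistic confidence measure the authors define. The function names `window_models`, `symmetric_kl`, and `detect_outliers` are hypothetical.

```python
import numpy as np


def window_models(features, win, hop):
    """Fit a diagonal Gaussian (mean, variance) to each sliding window."""
    starts = range(0, len(features) - win + 1, hop)
    means = np.array([features[s:s + win].mean(axis=0) for s in starts])
    # Small floor keeps variances strictly positive (illustrative choice).
    varis = np.array([features[s:s + win].var(axis=0) + 1e-6 for s in starts])
    return means, varis


def symmetric_kl(means, varis):
    """Pairwise symmetric KL divergence between diagonal Gaussians."""
    m1, m2 = means[:, None, :], means[None, :, :]
    v1, v2 = varis[:, None, :], varis[None, :, :]
    terms = v1 / v2 + v2 / v1 - 2.0 + (m1 - m2) ** 2 * (1.0 / v1 + 1.0 / v2)
    return 0.5 * terms.sum(axis=-1)


def detect_outliers(features, win=50, hop=25, sigma=None):
    """Return (is_outlier, score) per window.

    Assumed recipe: Gaussian-kernel affinity on model distances, then a
    two-way split by the sign of the second eigenvector of the
    symmetrically normalized affinity matrix. The smaller group is
    treated as the outliers.
    """
    means, varis = window_models(features, win, hop)
    dist = symmetric_kl(means, varis)
    if sigma is None:
        sigma = dist.mean()  # heuristic kernel scale, an assumption
    affinity = np.exp(-dist / (2.0 * sigma))

    deg = affinity.sum(axis=1)
    normalized = affinity / np.sqrt(np.outer(deg, deg))
    _, vecs = np.linalg.eigh(normalized)  # eigenvalues in ascending order
    second = vecs[:, -2] / np.sqrt(deg)   # second-largest eigenvector

    is_outlier = second > 0
    if is_outlier.sum() > len(is_outlier) / 2:
        is_outlier = ~is_outlier          # outliers are the minority group

    # Stand-in ranking score: mean distance to the background windows.
    background = ~is_outlier
    score = np.where(is_outlier, dist[:, background].mean(axis=1), 0.0)
    return is_outlier, score
```

As a usage example under the same assumptions, calling `detect_outliers(x, win=50, hop=50)` on a `(num_frames, num_features)` array `x` returns one boolean flag and one score per non-overlapping window; sorting the flagged windows by descending score gives a ranked list from which a summary of a chosen length could be cut.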

Cite

Radhakrishnan, R., Divakaran, A., Xiong, Z., & Otsuka, I. (2006). A content-adaptive analysis and representation framework for audio event discovery from “unscripted” multimedia. Eurasip Journal on Applied Signal Processing, 2006. https://doi.org/10.1155/ASP/2006/89013
