A discriminative and compact audio representation for event detection

Liping Jing; Bo Liu; Jaeyoung Choi; Adam Janin; Julia Bernd; Michael W. Mahoney; Gerald Friedland

Conference Proceedings

MM 2016 - Proceedings of the 2016 ACM Multimedia Conference (2016) 57-61

DOI: 10.1145/2964284.2970377

Abstract

This paper presents a novel two-phase method for audio representation: Discriminative and Compact Audio Representation (DCAR). In the first phase, each audio track is modeled using a Gaussian mixture model (GMM) that includes several components to capture the variability within that track. The second phase takes into account both global structure and local structure. In this phase, the components are rendered more discriminative and compact by formulating an optimization problem on Grassmannian manifolds, which we found represents the structure of audio effectively. Experimental results on the YLI-MED dataset show that the proposed DCAR representation consistently outperforms state-of-the-art audio representations: i-vector, mv-vector, and GMM.
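
For readers unfamiliar with the first phase, a Gaussian mixture model in its standard textbook form can be written as below. The notation is assumed here for illustration and is not taken from the paper: x is a frame-level feature vector from one track, K is the number of components, and w_k, μ_k, Σ_k are the weight, mean, and covariance of component k. The abstract does not state the number of components or the feature type, and the second-phase optimization on Grassmannian manifolds is not reproduced here because the abstract does not specify its objective.

```latex
p(\mathbf{x}) = \sum_{k=1}^{K} w_k \, \mathcal{N}\!\left(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right),
\qquad w_k \ge 0, \quad \sum_{k=1}^{K} w_k = 1
```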

Citation (APA)

Jing, L., Liu, B., Choi, J., Janin, A., Bernd, J., Mahoney, M. W., & Friedland, G. (2016). A discriminative and compact audio representation for event detection. In MM 2016 - Proceedings of the 2016 ACM Multimedia Conference (pp. 57–61). Association for Computing Machinery, Inc. https://doi.org/10.1145/2964284.2970377
