Statistical methods for scene and event classification

Abstract

This chapter surveys methods for pattern classification in audio data. Broadly speaking, these methods take as input some representation of audio, typically the raw waveform or a time-frequency spectrogram, and produce a semantically meaningful classification of its contents. We begin with a brief overview of statistical modeling, supervised machine learning, and model validation. This is followed by a survey of discriminative models for binary and multi-class classification problems. Next, we provide an overview of generative probabilistic models, including both maximum likelihood and Bayesian parameter estimation. We focus specifically on Gaussian mixture models and hidden Markov models, and their application to audio and time-series data. We then describe modern deep learning architectures, including convolutional networks, different variants of recurrent neural networks, and hybrid models. Finally, we survey model-agnostic techniques for improving the stability of classifiers.

McFee, B. (2017). Statistical methods for scene and event classification. In Computational Analysis of Sound Scenes and Events (pp. 103–146). Springer International Publishing. https://doi.org/10.1007/978-3-319-63450-0_5
