Statistical methods for scene and event classification

Brian McFee

Book Chapter

Springer International Publishing (2017), pp. 103-146

DOI: 10.1007/978-3-319-63450-0_5

Abstract

This chapter surveys methods for pattern classification in audio data. Broadly speaking, these methods take as input some representation of audio, typically the raw waveform or a time-frequency spectrogram, and produce a semantically meaningful classification of its contents. We begin with a brief overview of statistical modeling, supervised machine learning, and model validation. This is followed by a survey of discriminative models for binary and multi-class classification problems. Next, we provide an overview of generative probabilistic models, including both maximum likelihood and Bayesian parameter estimation. We focus specifically on Gaussian mixture models and hidden Markov models, and their application to audio and time-series data. We then describe modern deep learning architectures, including convolutional networks, different variants of recurrent neural networks, and hybrid models. Finally, we survey model-agnostic techniques for improving the stability of classifiers.

Citation (APA)

McFee, B. (2017). Statistical methods for scene and event classification. In Computational Analysis of Sound Scenes and Events (pp. 103–146). Springer International Publishing. https://doi.org/10.1007/978-3-319-63450-0_5
