Auditory scene classification with deep belief network

Abstract

Effective modeling and analysis of auditory scenes is crucial to many context-aware and content-based multimedia applications. In this paper, we explore the effectiveness of a multiple-layer generative deep neural network in discovering higher-level, highly non-linear probabilistic representations from the acoustic data of unstructured auditory scenes. We first create a more compact and representative description of the input audio clip by focusing on the salient regions of the data and modeling their contextual correlations. Next, we exploit a deep belief network (DBN) to discover, in an unsupervised manner, high-level descriptions of the scene audio as the activations of units on the higher hidden layers of the trained DBN model; these are finally classified into a scene category by either the discriminative output layer of the DBN or a separate classifier such as a support vector machine (SVM). Experiments demonstrate the effectiveness of the proposed DBN-based classification approach for auditory scenes.
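The pipeline the abstract describes — unsupervised layer-wise feature learning followed by a separate SVM classifier — can be approximated with stacked restricted Boltzmann machines, the building blocks of a DBN. The sketch below is illustrative only and is not the authors' implementation: the feature dimensions, layer sizes, and synthetic data stand in for the paper's salient-region audio descriptors.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in for per-clip acoustic features (e.g. frame-level
# spectral statistics), scaled to [0, 1] as BernoulliRBM expects.
rng = np.random.default_rng(0)
X = rng.random((200, 64))            # 200 clips, 64-dim features (assumed sizes)
y = rng.integers(0, 3, size=200)     # hypothetical scene labels

# Two stacked RBMs approximate the DBN's unsupervised pre-training;
# the top hidden-layer activations feed a separate SVM classifier,
# mirroring the DBN + SVM variant described in the abstract.
model = Pipeline([
    ("rbm1", BernoulliRBM(n_components=128, learning_rate=0.05,
                          n_iter=10, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05,
                          n_iter=10, random_state=0)),
    ("svm", SVC(kernel="rbf")),
])
model.fit(X, y)          # RBM stages train unsupervised; SVC uses the labels
pred = model.predict(X)
print(pred.shape)        # (200,)
```

Note that scikit-learn's RBM pipeline performs only greedy unsupervised pre-training; the paper's alternative of a discriminative DBN output layer would additionally fine-tune the whole network with backpropagation.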

Citation (APA)

Su, F., & Xue, L. (2015). Auditory scene classification with deep belief network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8935, pp. 348–359). Springer Verlag. https://doi.org/10.1007/978-3-319-14445-0_30
