Investigating the corpus independence of the bag-of-audio-words approach

Abstract

In this paper, we analyze the general use of the Bag-of-Audio-Words (BoAW) feature extraction method. This technique allows us to handle the problem of varying-length recordings. The first step of the BoAW method is to define cluster centers (called codewords) over our feature set with an unsupervised training method (such as k-means clustering or even random sampling). This step is normally performed on the training set of the actual database, but this approach has drawbacks: new codewords have to be created for each data set, which increases the computing time and can lead to over-fitting. Here, we analyze how much the codebook depends on the given corpus. In our experiments, we work with three databases: a Hungarian emotion database, a German emotion database and a general Hungarian speech database. We construct a set of codewords on each of these databases and examine how the classification accuracy scores vary on the Hungarian emotion database. According to our results, the classification performance was similar in each case, which suggests that the Bag-of-Audio-Words codebook is practically corpus-independent. This corpus independence allows us to reuse codebooks created on different datasets, which can make it easier to use the BoAW method in practice.
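
The two BoAW steps named in the abstract (building a codebook, then turning a recording of any length into a fixed-size vector) can be illustrated with a minimal Python sketch. It uses the random-sampling variant of codebook construction. The abstract does not state the frame-level features, codebook size, distance measure, or normalization, so the toy Gaussian frames, the function names `build_codebook` and `boaw_histogram`, the squared Euclidean distance, and the count normalization are all assumptions made for illustration, not the authors' setup.

```python
import random


def build_codebook(frames, size, seed=0):
    """Pick `size` frames at random to act as codewords (random-sampling variant)."""
    rng = random.Random(seed)
    return rng.sample(frames, size)


def squared_distance(a, b):
    """Squared Euclidean distance between two frame vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def boaw_histogram(frames, codebook):
    """Map a variable-length list of frames to a fixed-length normalized histogram."""
    counts = [0] * len(codebook)
    for frame in frames:
        # Assign the frame to its nearest codeword.
        nearest = min(
            range(len(codebook)),
            key=lambda i: squared_distance(frame, codebook[i]),
        )
        counts[nearest] += 1
    total = len(frames) or 1  # avoid division by zero for an empty recording
    return [c / total for c in counts]


# Toy stand-in data: the codebook comes from one "corpus" and is reused
# on recordings of different lengths from another.
rng = random.Random(1)
corpus_a = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
codebook = build_codebook(corpus_a, size=8)

short_rec = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(30)]
long_rec = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(75)]

h_short = boaw_histogram(short_rec, codebook)
h_long = boaw_histogram(long_rec, codebook)
```

Both recordings end up as vectors of the same length (the codebook size), which is what lets a standard classifier consume them. Because the codebook is built independently of the recordings it is applied to, swapping in a codebook from a different corpus only changes the `codebook` argument.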

Cite

Vetráb, M., & Gosztolya, G. (2020). Investigating the corpus independence of the bag-of-audio-words approach. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12284 LNAI, pp. 285–293). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58323-1_31
