Automatic discrimination of speech and music is an important tool in many multimedia applications. This article presents an evolutionary, fuzzy, rule-based speech/music discrimination approach for intelligent audio coding that relies on a single simple feature, the Warped LPC-based Spectral Centroid (WLPC-SC). WLPC-SC is compared with the classical audio classification features proposed in the literature in order to assess its discriminatory power. The vector describing this psychoacoustic-based feature is reduced to a few statistical values (mean, variance, and skewness), which are then transformed to a new feature space by linear discriminant analysis (LDA) to increase classification accuracy. A support vector machine (SVM) then classifies the features in the transformed space. The final decision is made by a fuzzy expert system, which improves on the accuracy of the SVM by taking into account the labels that classifier assigned to past audio frames; the resulting accuracy improvement is also reported. Experimental results show that our speech/music discriminator is robust and fast, making it suitable for intelligent audio coding.
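Two steps of the pipeline described above can be illustrated with a minimal Python sketch: reducing a per-frame feature track to its mean, variance, and skewness, and letting past frame labels influence the current decision. This is not the authors' implementation. The function names `summarize` and `smooth_labels`, the `memory` parameter, and the 0.5 threshold are hypothetical, and simple exponential smoothing stands in for the fuzzy expert system, whose rules the abstract does not give. The WLPC-SC extraction, the LDA projection, and the SVM are omitted.

```python
from statistics import fmean


def summarize(track):
    """Return (mean, variance, skewness) of a per-frame feature track.

    Uses population moments; skewness is the third standardized moment.
    """
    mu = fmean(track)
    var = fmean([(x - mu) ** 2 for x in track])
    if var == 0:
        # A constant track has no spread, so skewness is taken as 0 here.
        return mu, 0.0, 0.0
    skew = fmean([(x - mu) ** 3 for x in track]) / var ** 1.5
    return mu, var, skew


def smooth_labels(scores, memory=0.7):
    """Label frames as 'music' or 'speech' using past evidence.

    scores: per-frame classifier outputs in [0, 1], read here as the
            degree to which a frame looks like music (an assumption).
    memory: weight given to the belief carried over from past frames.
    """
    belief = 0.5  # start undecided
    labels = []
    for s in scores:
        # Exponential smoothing: a stand-in for the fuzzy expert system.
        belief = memory * belief + (1 - memory) * s
        labels.append("music" if belief >= 0.5 else "speech")
    return labels
```

For example, `smooth_labels([1.0, 1.0, 0.0, 1.0])` keeps the third frame labeled as music, because a single contradicting frame does not outweigh the accumulated history. This is the kind of isolated misclassification that a history-aware final decision stage is meant to suppress.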
CITATION STYLE
Munoz-Exposito, J. E., Galan, S. G., Reyes, N. R., & Candeas, P. V. (2009). Speech/music discrimination based on warping transformation and fuzzy logic for intelligent audio coding. Applied Artificial Intelligence, 23(5), 427–442. https://doi.org/10.1080/08839510902872306