Aiming at automatic detection of non-linguistic sounds from vocalizations, we investigate the applicability of various subsets of audio features, which were formed on the basis of ranking the relevance and the individual quality of several audio features. Specifically, based on the ranking of the large set of audio descriptors, we performed selection of subsets and evaluated them on the non-linguistic sound recognition task. During the audio parameterization process, every input utterance is converted to a single feature vector, which consists of 207 parameters. Next, a subset of this feature vector is fed to a classification model, which aims at straight estimation of the unknown sound class. The experimental evaluation showed that the feature vector composed of the 50-best ranked parameters provides a good trade-off between computational demands and accuracy, and that the best accuracy, in terms of recognition accuracy, is observed for the 150-best subset. © 2014 Springer International Publishing.
CITATION STYLE
Theodorou, T., Mporas, I., & Fakotakis, N. (2014). Audio feature selection for recognition of non-linguistic vocalization sounds. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8445 LNCS, pp. 395–405). Springer Verlag. https://doi.org/10.1007/978-3-319-07064-3_32
Mendeley helps you to discover research relevant for your work.