The Effect of Narrow-Band Transmission on Recognition of Paralinguistic Information from Human Vocalizations

This article is free to access.

Abstract

Practically no knowledge exists on the effects of speech coding and recognition for narrow-band transmission of speech signals within certain frequency ranges, especially in relation to the recognition of paralinguistic cues in speech. We thus investigated the impact of narrow-band standard speech coders on the machine-based classification of affective vocalizations and clinical vocal recordings. In addition, we analyzed the effect of low-pass filtering speech at a set of different cut-off frequencies, either chosen as static values in the 0.5-5 kHz range or given dynamically by upper limits derived from the first five speech formants (F1-F5). Speech coding and recognition were tested, first, in relation to short-term speaker states, using affective vocalizations from the Geneva Multimodal Emotion Portrayals. Second, in relation to long-term speaker traits, we tested vocal recordings from clinical populations with speech impairments, as found in the Child Pathological Speech Database. We employed a large acoustic feature space derived from the Interspeech Computational Paralinguistics Challenge. Besides analyzing the sheer corruption outcome, we analyzed the potential of matched and multicondition training as opposed to the mismatched condition. The results show, first, that multicondition and matched-condition training significantly increase performance compared with the mismatched condition. Second, downgrades in classification accuracy occur, but only at comparably severe levels of low-pass filtering. The downgrades appear especially for multi-categorical rather than binary decisions, and they can be dealt with reasonably by the training strategies mentioned above.
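
As a rough illustration of the static low-pass condition described in the abstract, the following sketch builds a Hamming-windowed sinc filter and measures how much of a test tone survives at several cut-off frequencies. It is not the authors' processing chain: the sample rate, the number of filter taps, the specific cut-off values, and all function names are assumptions chosen for illustration, and the formant-based (F1-F5) cut-offs are not covered because they would require a formant tracker.

```python
import math

# Assumed values for illustration only; the abstract states just the 0.5-5 kHz range.
SAMPLE_RATE_HZ = 16000
STATIC_CUTOFFS_HZ = [500, 1000, 2000, 3000, 4000, 5000]


def lowpass_fir(cutoff_hz, sample_rate_hz, num_taps=101):
    """Return Hamming-windowed sinc low-pass coefficients with unity DC gain."""
    fc = cutoff_hz / sample_rate_hz  # normalized cut-off in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (sinc), with the x == 0 limit handled.
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def apply_fir(signal, taps):
    """Causal FIR filtering by direct convolution; output has the input's length."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            if i - k >= 0:
                acc += t * signal[i - k]
        out.append(acc)
    return out


def rms(values):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def tone(freq_hz, sample_rate_hz, num_samples):
    """Unit-amplitude sine tone used as a stand-in for real speech."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(num_samples)]


def gain_at(freq_hz, taps, sample_rate_hz=SAMPLE_RATE_HZ, num_samples=800):
    """Ratio of output to input RMS for a tone, ignoring the start-up transient."""
    x = tone(freq_hz, sample_rate_hz, num_samples)
    y = apply_fir(x, taps)
    skip = len(taps)
    return rms(y[skip:]) / rms(x[skip:])


if __name__ == "__main__":
    for cutoff in STATIC_CUTOFFS_HZ:
        taps = lowpass_fir(cutoff, SAMPLE_RATE_HZ)
        print(cutoff, round(gain_at(300, taps), 3), round(gain_at(6000, taps), 3))
```

In a setup like the one the abstract describes, each filtered copy of the data would then be passed through feature extraction and classification, so that the matched, mismatched, and multicondition training regimes can be compared per cut-off.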

Cite

Frühholz, S., Marchi, E., & Schuller, B. (2016). The Effect of Narrow-Band Transmission on Recognition of Paralinguistic Information from Human Vocalizations. IEEE Access, 4, 6059–6072. https://doi.org/10.1109/ACCESS.2016.2604038
