This study presents a comprehensive analysis of affective vocal expression classification systems. The Hilbert–Huang–Hurst coefficient (HHHC) vector is proposed as a non-linear vocal source feature that represents emotional states according to their effects on the speech production mechanism. Affective states are highlighted by an empirical mode decomposition-based method, which exploits the non-stationarity of the acoustic variations. Hurst coefficients are then estimated from the decomposition modes to form the feature vector. Additionally, a vector of the index of non-stationarity (INS) is introduced to add dynamic information to the HHHC. The proposed feature vector is evaluated in speech emotion classification experiments with three databases in German and English. Three state-of-the-art acoustic feature vectors are adopted as baselines. The α-integrated Gaussian mixture model (α-GMM) is also introduced for emotion representation and classification. Its performance is compared with that of competing stochastic and machine learning classifiers. Results demonstrate that the HHHC leads to significant classification improvement when compared to the baseline acoustic feature vectors. Moreover, the α-GMM outperforms the competing classification methods. Finally, the complementarity of HHHC and INS is also evaluated for the GeMAPS and eGeMAPS feature sets.
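To make the "Hurst coefficient per decomposition mode" step concrete, the following is a minimal sketch, not the authors' implementation. It assumes the decomposition modes have already been computed by some empirical mode decomposition routine, and it uses a rescaled-range (R/S) estimator as a stand-in, since the abstract does not say which Hurst estimator the paper uses. The names `hurst_rs` and `hurst_vector` are hypothetical.

```python
import math


def hurst_rs(signal, min_window=8):
    """Estimate a Hurst coefficient with rescaled-range (R/S) analysis.

    Assumption: R/S is a stand-in estimator chosen for illustration only.
    """
    n = len(signal)
    log_sizes, log_rs = [], []
    size = min_window
    while size <= n // 2:
        rs_values = []
        # Split the signal into non-overlapping blocks of the current size.
        for start in range(0, n - size + 1, size):
            block = signal[start:start + size]
            mean = sum(block) / size
            cum = lowest = highest = sq = 0.0
            for x in block:
                dev = x - mean
                cum += dev                  # cumulative deviation from the mean
                lowest = min(lowest, cum)
                highest = max(highest, cum)
                sq += dev * dev
            std = math.sqrt(sq / size)
            if std > 0:
                rs_values.append((highest - lowest) / std)
        if rs_values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(rs_values) / len(rs_values)))
        size *= 2
    if len(log_sizes) < 2:
        raise ValueError("signal too short for R/S analysis")
    # Least-squares slope of log(R/S) against log(window size).
    k = len(log_sizes)
    mx = sum(log_sizes) / k
    my = sum(log_rs) / k
    num = sum((x - mx) * (y - my) for x, y in zip(log_sizes, log_rs))
    den = sum((x - mx) ** 2 for x in log_sizes)
    return num / den


def hurst_vector(modes):
    """Build one feature vector: a Hurst estimate for each mode.

    `modes` is assumed to be a list of equal-length sample sequences
    produced by an EMD step that is outside the scope of this sketch.
    """
    return [hurst_rs(mode) for mode in modes]
```

Under this sketch, a frame of speech decomposed into, say, six modes would yield a six-dimensional vector. R/S estimates are known to be biased for short windows, so the values are indicative rather than exact.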
Vieira, V., Coelho, R., & de Assis, F. M. (2020). Hilbert–Huang–Hurst-based non-linear acoustic feature vector for emotion classification with stochastic models and learning systems. IET Signal Processing, 14(8), 522–532. https://doi.org/10.1049/iet-spr.2019.0383