Cross-corpus speech emotion recognition (SER) is an active topic in emotion classification. Cross-corpus SER involves four issues: feature selection, constraining the differences between corpora, label regression, and preservation of discriminative emotion features. Few previous studies address all four issues jointly. In this work, we propose the transfer emotion-discriminative features subspace learning (TEDFSL) method. Acoustic features are extracted from the source and target data with openSMILE. The extracted features are then fed into a CNN+BLSTM network to learn higher-level global features and time-series information. A common low-dimensional subspace of the source and target data is learned using Linear Discriminant Analysis (LDA) to reduce the dimension, and Maximum Mean Discrepancy (MMD) and Graph Embedding (GE) to constrain the differences between the source and target data. This common low-dimensional subspace is combined with a label regression matrix to learn the relationship between labels and features, after which a DNN is selected as the final classifier. To preserve emotion-discriminative features, an emotion-aware center loss (lc) is added. Extensive experiments on cross-corpus SER tasks demonstrate that the proposed method outperforms state-of-the-art cross-corpus SER methods.
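To make the role of MMD concrete, the following is a minimal sketch of how the discrepancy between source and target features can be measured. It assumes a linear kernel, under which the squared MMD reduces to the squared distance between the two feature means; the abstract does not state which kernel TEDFSL uses, and the function name `linear_mmd` and the synthetic data are purely illustrative, not the authors' implementation.

```python
import numpy as np


def linear_mmd(source, target):
    """Squared MMD with a linear kernel (an assumption, not the paper's stated choice).

    With a linear kernel, the squared MMD equals the squared Euclidean
    distance between the mean feature vectors of the two corpora.
    """
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(diff @ diff)


# Synthetic stand-ins for projected source/target features (hypothetical data).
rng = np.random.default_rng(0)
source = rng.normal(loc=0.0, scale=1.0, size=(200, 8))
target = rng.normal(loc=0.5, scale=1.0, size=(150, 8))

print("MMD^2 source vs target:", linear_mmd(source, target))
print("MMD^2 source vs source:", linear_mmd(source, source))
```

A smaller value indicates that the two corpora have closer feature means in the shared subspace, which is what a subspace-learning objective with an MMD term would try to achieve.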
CITATION STYLE
Kexin, Z., & Yunxiang, L. (2023). Speech Emotion Recognition Based on Transfer Emotion-Discriminative Features Subspace Learning. IEEE Access, 11, 56336–56343. https://doi.org/10.1109/ACCESS.2023.3282982