Semi-supervised Ladder Networks for Speech Emotion Recognition

Jian Hua Tao; Jian Huang; Ya Li; Zheng Lian; Ming Yue Niu

Journal Article, Open Access

International Journal of Automation and Computing (2019) 16(4) 437-448

DOI: 10.1007/s11633-019-1175-x

Abstract

As a major component of speech signal processing, speech emotion recognition has become increasingly essential to understanding human communication. Building on deep learning, researchers have proposed unsupervised models to extract effective emotional features and supervised models to train emotion recognition systems. In this paper, we apply semi-supervised ladder networks to speech emotion recognition. The model is trained by minimizing a supervised loss together with an auxiliary unsupervised cost function. The unsupervised auxiliary task yields more discriminative representations of the input features and also acts as a regularizer for the supervised emotion recognition task. We also compare the ladder network with other classical autoencoder structures. The experiments were conducted on the interactive emotional dyadic motion capture (IEMOCAP) database, and the results show that the proposed method performs well when only a small amount of labelled data is available and outperforms the other methods compared.
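
The training objective described in the abstract, a supervised loss plus an auxiliary unsupervised cost, can be illustrated with a small sketch. The abstract does not give the exact form of either term, so the following assumes the usual ladder-network formulation: cross-entropy on the labelled examples plus a weighted sum of layer-wise reconstruction errors between clean encoder activations and the decoder's denoised estimates. All function names, the layer weights, and the array shapes are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def supervised_loss(logits, labels):
    """Mean cross-entropy over the labelled examples only."""
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def reconstruction_cost(clean_acts, denoised_acts, layer_weights):
    """Weighted sum of per-layer mean squared reconstruction errors.

    Assumption: one (clean, denoised) activation pair per layer, computed on
    both labelled and unlabelled examples, since this term needs no labels.
    """
    total = 0.0
    for z, z_hat, weight in zip(clean_acts, denoised_acts, layer_weights):
        total += weight * float(np.mean((z - z_hat) ** 2))
    return total

def total_cost(logits, labels, clean_acts, denoised_acts, layer_weights):
    """Combined objective: supervised loss plus auxiliary unsupervised cost."""
    return supervised_loss(logits, labels) + reconstruction_cost(
        clean_acts, denoised_acts, layer_weights
    )
```

In this sketch the unlabelled data influence training only through `reconstruction_cost`, which is one way to read the abstract's remark that the auxiliary task regularizes the supervised one.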

Citation (APA)

Tao, J. H., Huang, J., Li, Y., Lian, Z., & Niu, M. Y. (2019). Semi-supervised Ladder Networks for Speech Emotion Recognition. International Journal of Automation and Computing, 16(4), 437–448. https://doi.org/10.1007/s11633-019-1175-x
