Speech emotion recognition using capsNet

ISSN: 2278-3075

Abstract

Extracting emotion features is the key to recognizing emotion from speech. CapsNet is an emerging neural network architecture that gives better performance than convolutional neural networks in feature extraction. This paper implements a speech emotion recognition system using CapsNet. With the gradual development of a new generation of man-machine interaction technology, speech emotion recognition has attracted wide research attention. Given the development trend of new technologies, speech interaction is going to reach thousands of households. Traditional machine learning methods have made great progress in speech emotion recognition. However, some problems remain. First, it is unclear which features reflect the differences between emotions. Second, these hand-designed features depend heavily on the database and generalize poorly. Extracting features from speech also takes a long time. Deep learning can extract different layers of features from the original data through automatic learning. A Capsule Network, or CapsNet, is composed of a number of capsules in each layer, as the name indicates. Each capsule is a group of neurons that work together to produce a specific outcome for the capsule. Speech emotion recognition works on the spectrogram constructed from the voice recording. A spectrogram is a plot of the spectrum of frequencies of a sound as they vary with time.
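
The two building blocks named in the abstract can be illustrated with a minimal sketch. The function names, frame length, hop size, and Hann window below are illustrative assumptions, not the authors' settings. The `squash` function is the squashing nonlinearity commonly used in capsule networks; the abstract does not say which capsule formulation the paper uses.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Magnitude spectrogram from a short-time Fourier transform.

    frame_len and hop are assumed values chosen for illustration.
    Returns (freqs, times, spec) with spec shaped (freq_bins, n_frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One FFT per frame; transpose so rows are frequency bins.
    spec = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    # Time stamp of each frame, taken at the frame centre.
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs, times, spec

def squash(vectors, axis=-1, eps=1e-9):
    """Rescale each capsule output vector to a length below 1.

    The direction is kept, so the length can be read as a
    probability-like activation (assumed formulation).
    """
    sq_norm = np.sum(vectors ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * vectors / np.sqrt(sq_norm + eps)
```

For example, calling `spectrogram` on one second of a pure tone sampled at 8000 Hz yields a two-dimensional array whose strongest row lies near the tone's frequency. An array of this kind is what a network would take as input in place of hand-designed features.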

Sukanya, K. S., & Sunny, L. E. (2019). Speech emotion recognition using capsNet. International Journal of Innovative Technology and Exploring Engineering, 8(6), 33–36.
