Abstract
Extracting emotion features is the key to recognizing emotion from speech. The capsule network (CapsNet) is an emerging neural network architecture that gives better performance than convolutional neural networks in feature extraction. This paper implements a speech emotion recognition system using CapsNet. With the gradual development of a new generation of human-machine interaction technology, speech emotion recognition has attracted wide research attention. As these technologies develop, speech interaction is expected to reach thousands of households. Traditional machine learning methods have made great progress in speech emotion recognition. However, some problems remain. First, it is unclear which features reflect the differences between emotions. Second, these hand-designed features depend heavily on the database and generalize poorly. Extracting features from speech also takes a long time. Deep learning can extract different layers of features from the original data through automatic learning. A capsule network, as the name indicates, is composed of a number of capsules in each layer. Each capsule is a group of neurons that work together to produce a specific outcome for the capsule. Speech emotion recognition here works on the spectrogram constructed from the voice recording. A spectrogram is a plot of the frequency spectrum of a sound as it varies with time.
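To make the spectrogram definition concrete, the following is a minimal sketch of how one could be computed with a short-time Fourier transform. It is an illustration only, not the authors' pipeline: the function name `spectrogram`, the frame length, hop size, Hann window, dB scaling, and the synthetic 1 kHz test tone are all assumptions chosen for the example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Return a (frequency bins x time frames) power spectrogram in dB.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One spectrum per frame: shape (n_frames, frame_len // 2 + 1).
    spectrum = np.fft.rfft(frames, axis=1)
    power = np.abs(spectrum) ** 2
    # Small constant avoids log(0); transpose so rows are frequency bins.
    return 10.0 * np.log10(power + 1e-10).T

# Synthetic stand-in for a voice recording: one second of a 1 kHz tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sample_rate / 256  # bin index converted to frequency in Hz
```

Each column of `spec` is the spectrum of one short time slice, so the array can be treated as a two-dimensional image of frequency against time, which is the kind of input a network such as CapsNet could consume.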
Sukanya, K. S., & Sunny, L. E. (2019). Speech emotion recognition using capsNet. International Journal of Innovative Technology and Exploring Engineering, 8(6), 33–36.