Emotions are essential in daily human life. Our goal is to build a machine that can recognize the human emotional state and respond intelligently to human needs, a capability that is important for human-computer interaction (HCI). The majority of existing work concentrates on the classification of only six basic emotions. This work proposes a bimodal emotion recognition system for human emotion detection. In this method, facial landmarks are combined with a Gabor filter bank to extract facial features, and the pyAudioAnalysis library is used to extract speech features. The facial and speech features are fused at the feature level and forwarded to a deep belief network (DBN) for the classification of eight basic emotions. Finally, the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) is used in the experiments for both unimodal and bimodal emotion recognition. The results show that bimodal emotion recognition with the DBN, tested on the RAVDESS database for the classification of eight basic emotions from facial and speech information, achieved an overall accuracy of 97.92%, which is better than unimodal emotion recognition (facial or speech expression alone).
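The abstract names pyAudioAnalysis for speech features and a Gabor filter bank for facial features, but gives no implementation details. The following is a minimal sketch of such a feature-extraction and feature-level fusion step: the OpenCV-based Gabor bank, its scale/orientation parameters, the mean/std pooling, the 50 ms / 25 ms audio framing, and the per-utterance frame averaging are all assumptions for illustration, not the authors' settings, and the facial-landmark detection step is omitted.

import numpy as np
import cv2
from pyAudioAnalysis import audioBasicIO, ShortTermFeatures

def gabor_facial_features(gray_face, ksize=31, scales=(4.0, 8.0), orientations=4):
    """Filter a grayscale face crop with a small Gabor bank and pool responses.

    All bank parameters here are illustrative assumptions.
    """
    feats = []
    for sigma in scales:
        for k in range(orientations):
            theta = k * np.pi / orientations
            kern = cv2.getGaborKernel((ksize, ksize), sigma, theta,
                                      lambd=10.0, gamma=0.5)
            resp = cv2.filter2D(gray_face, cv2.CV_32F, kern)
            feats.extend([resp.mean(), resp.std()])  # simple pooled statistics
    return np.array(feats, dtype=np.float32)

def speech_features(wav_path):
    """Short-term speech features via pyAudioAnalysis, averaged over frames."""
    fs, x = audioBasicIO.read_audio_file(wav_path)
    x = audioBasicIO.stereo_to_mono(x)
    feats, names = ShortTermFeatures.feature_extraction(
        x, fs, int(0.050 * fs), int(0.025 * fs))  # 50 ms windows, 25 ms step
    return feats.mean(axis=1)  # one feature vector per utterance

# Feature-level fusion: concatenate the two modality vectors, e.g.
# fused = np.concatenate([gabor_facial_features(face), speech_features("clip.wav")])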
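The abstract does not specify the DBN architecture or training procedure. As a rough approximation only, the sketch below stacks scikit-learn BernoulliRBM layers for greedy layer-wise unsupervised pretraining with a logistic-regression read-out for the eight emotion classes; a full DBN would add joint supervised fine-tuning, and the layer sizes, hyperparameters, and random data standing in for the fused features are all placeholders.

import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Hypothetical fused feature matrix X (n_samples x n_features) and labels y (0..7).
rng = np.random.default_rng(0)
X = rng.random((200, 120)).astype(np.float32)
y = rng.integers(0, 8, size=200)

dbn = Pipeline([
    ("scale", MinMaxScaler()),  # RBMs expect inputs scaled to [0, 1]
    ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.05,
                          n_iter=20, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=128, learning_rate=0.05,
                          n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),  # supervised read-out layer
])

# Pipeline.fit trains each RBM on the previous layer's output (greedy
# pretraining), then fits the classifier on the top-layer representation.
dbn.fit(X, y)
print(dbn.predict(X[:5]))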
Jaratrotkamjorn, A., & Choksuriwong, A. (2021). Bimodal emotion recognition using deep belief network. ECTI Transactions on Computer and Information Technology, 15(1), 73–81. https://doi.org/10.37936/ecti-cit.2021151.226446