Abstract
Automatic emotion recognition from multimodal content has become an important and growing research area in human-computer interaction. Recent literature has typically relied on either audio or facial expressions for emotion detection; however, emotions and body gestures are also closely related. This paper explores the effectiveness of combining the text, audio, facial expression, and body gesture modalities of multimodal content with machine learning and deep learning models to build more accurate and robust automatic multimodal emotion recognition systems. First, we obtain the best accuracy achievable from each individual modality. Then we apply feature-level fusion and ensemble-based decision-level fusion to combine multiple modalities and improve the results. The proposed models were evaluated on the IEMOCAP dataset, and the results show that models using multiple modalities classify emotions more accurately than unimodal models.
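The abstract names two fusion strategies. Below is a minimal NumPy sketch, not the authors' implementation, illustrating what each means: feature-level fusion concatenates per-modality feature vectors before a single classifier, while decision-level (ensemble) fusion combines per-modality classifier outputs. The feature dimensions, emotion classes, and probability outputs are hypothetical placeholders.

```python
# Sketch of the two fusion strategies named in the abstract (assumed setup).
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 8, 4  # e.g. angry, happy, sad, neutral (hypothetical)

# Hypothetical per-modality feature matrices (text, audio, face, gesture).
text_feat    = rng.normal(size=(n_samples, 100))
audio_feat   = rng.normal(size=(n_samples, 64))
face_feat    = rng.normal(size=(n_samples, 128))
gesture_feat = rng.normal(size=(n_samples, 32))

# Feature-level fusion: concatenate modality features into one vector per
# sample, then train a single classifier on the fused representation.
fused_features = np.concatenate(
    [text_feat, audio_feat, face_feat, gesture_feat], axis=1)
print(fused_features.shape)  # (8, 324)

# Decision-level (ensemble) fusion: each modality's classifier emits class
# probabilities; combine them (here by averaging) and take the argmax.
per_modality_probs = rng.dirichlet(np.ones(n_classes),
                                   size=(4, n_samples))  # 4 modalities
avg_probs = per_modality_probs.mean(axis=0)              # (n_samples, n_classes)
predicted_emotion = avg_probs.argmax(axis=1)
print(predicted_emotion)
```

Averaging probabilities is only one possible combination rule; majority voting or weighted voting over the per-modality predictions would fit the same decision-level scheme.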
Huddar, M. G., Sannakki, S. S., & Rajpurohit, V. S. (2019). Multimodal emotion recognition using facial expressions, body gestures, speech, and text modalities. International Journal of Engineering and Advanced Technology, 8(5), 2453–2459.