Towards Machine Learning-Based Emotion Recognition from Multimodal Data

Abstract

Understanding human emotion is vital for communicating effectively with others, monitoring patients, analysing behaviour, and keeping an eye on those who are vulnerable. Emotion recognition is essential to achieving a complete human-machine interoperability experience. In recent years, artificial intelligence, mainly machine learning (ML), has been used to improve models for recognising emotions from a single type of data. In this work, a multimodal system is proposed that uses text, facial expressions, and speech signals to identify emotions. In the proposed model, the MobileNet architecture is used to predict emotion from facial expressions, and different ML classifiers are used to predict emotion from text and speech signals. The Facial Expression Recognition 2013 (FER2013) dataset is used to recognise emotion from facial expressions, whilst the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset is used for both text and speech emotion recognition. The proposed ensemble technique, consisting of a random forest, extreme gradient boosting, and a multi-layer perceptron, achieves an accuracy of 70.67%, which is better than the unimodal approaches used.
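
As an illustration only (not the authors' released code), the following Python sketch shows one way an ensemble of a random forest, extreme gradient boosting, and a multi-layer perceptron, as described in the abstract, could be assembled with scikit-learn and xgboost. The fused feature matrix X and labels y are placeholders; feature extraction from FER2013 and IEMOCAP is not covered here.

# Hypothetical sketch of the described ensemble: soft voting over a random
# forest, an XGBoost classifier, and a multi-layer perceptron.
# X and y below are placeholder fused feature vectors and emotion labels,
# not features actually extracted from FER2013 or IEMOCAP.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

# Placeholder data: 1000 samples, 128-dimensional fused features, 4 emotion classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 128))
y = rng.integers(0, 4, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("xgb", XGBClassifier(n_estimators=200, eval_metric="mlogloss")),
        ("mlp", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)),
    ],
    voting="soft",  # average predicted class probabilities across the three models
)
ensemble.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, ensemble.predict(X_test)))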

Citation (APA)

Shahriar, M. F., Arnab, M. S. A., Khan, M. S., Rahman, S. S., Mahmud, M., & Kaiser, M. S. (2023). Towards Machine Learning-Based Emotion Recognition from Multimodal Data. In Lecture Notes in Networks and Systems (Vol. 519 LNNS, pp. 99–109). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-19-5191-6_9
