Deep neural network-based fusion model for emotion recognition using visual data

Abstract

In this study, we present a fusion model for emotion recognition based on visual data. The proposed model takes video as input and generates an emotion label for each video sample. We first select the most significant face regions from the video frames through a face detection and selection step. We then employ three CNN-based architectures to extract high-level features from the face image sequence, and attach one additional module to each CNN-based architecture to capture the sequential information across the entire video. Combining the three CNN-based models in a late-fusion approach yields competitive results against the baseline approaches on two public datasets: AFEW 2016 and SAVEE.
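
To make the described pipeline concrete, the PyTorch sketch below mirrors its structure: per-frame CNN features over cropped face regions, a sequential module across the frames, and late fusion by averaging branch probabilities. The ResNet-18 backbone, the LSTM temporal module, the 7-class output, and the averaging fusion rule are illustrative assumptions rather than the paper's exact configuration; in particular, the paper combines three different CNN architectures, whereas this sketch reuses one backbone for brevity.

```python
# Minimal sketch of a three-branch late-fusion emotion recognizer.
# Assumptions (not taken from the paper): ResNet-18 backbones, an LSTM
# as the sequential module, probability averaging as the fusion rule,
# and 7 emotion classes (as in AFEW 2016 and SAVEE).
import torch
import torch.nn as nn
from torchvision import models


class CnnLstmBranch(nn.Module):
    """One branch: frame-level CNN features followed by a sequential module."""

    def __init__(self, num_emotions: int, hidden_size: int = 256):
        super().__init__()
        backbone = models.resnet18(weights=None)  # assumed backbone
        backbone.fc = nn.Identity()               # keep 512-d frame features
        self.backbone = backbone
        self.temporal = nn.LSTM(512, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_emotions)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, H, W) -- selected face crops per frame
        b, t, c, h, w = frames.shape
        feats = self.backbone(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        seq_out, _ = self.temporal(feats)
        return self.classifier(seq_out[:, -1])    # logits at last time step


class LateFusionModel(nn.Module):
    """Averages per-branch emotion probabilities (late fusion)."""

    def __init__(self, num_emotions: int = 7, num_branches: int = 3):
        super().__init__()
        self.branches = nn.ModuleList(
            CnnLstmBranch(num_emotions) for _ in range(num_branches)
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        probs = [branch(frames).softmax(dim=-1) for branch in self.branches]
        return torch.stack(probs).mean(dim=0)     # fused class probabilities


if __name__ == "__main__":
    model = LateFusionModel()
    video = torch.randn(2, 16, 3, 224, 224)       # 2 clips of 16 face crops
    print(model(video).shape)                     # torch.Size([2, 7])
```

Late fusion, as used here, combines the branches only at the decision level (averaged class probabilities), so each branch can be trained and swapped independently of the others.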

Citation (APA)

Do, L. N., Yang, H. J., Nguyen, H. D., Kim, S. H., Lee, G. S., & Na, I. S. (2021). Deep neural network-based fusion model for emotion recognition using visual data. Journal of Supercomputing, 77(10), 10773–10790. https://doi.org/10.1007/s11227-021-03690-y
