Emotion recognition is important in human communication and in achieving complete interaction between humans and machines. In medical applications, emotion recognition is used to assist children with Autism Spectrum Disorder (ASD) in improving their socio-emotional communication, to help doctors diagnose diseases such as depression and dementia, and to help the caretakers of older patients monitor their well-being. This paper discusses the application of feature-level fusion of speech and facial expressions for emotions such as neutral, happy, sad, angry, surprise, fearful, and disgust, and explores how best to build deep learning networks to classify these emotions independently and jointly from the two modalities. A VGG model is used to extract features from facial images, and spectral features are extracted from speech signals. A feature-level fusion technique is then adopted to fuse the features extracted from the two modalities, and Principal Component Analysis (PCA) is applied to select the significant features. The proposed method achieved a maximum accuracy of 90% on the training set and 82% on the validation set. The recognition rate with multimodal data improved greatly compared to the unimodal systems; the multimodal system improved on the speech-only system by 9%. These results show that the proposed Multimodal Emotion Recognition (MER) system outperforms the unimodal emotion recognition systems.
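The pipeline described above (VGG features from face images, spectral features from speech, feature-level concatenation, then PCA) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the choice of VGG16 pretrained on ImageNet, MFCCs as the spectral features, the feature dimensions, and the helper names (face_features, speech_features, fuse_and_reduce) are all assumptions not specified in the abstract.

```python
import numpy as np
import librosa
from sklearn.decomposition import PCA
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input

# Assumption: VGG16 pretrained on ImageNet as the facial feature extractor,
# with global average pooling to obtain one fixed-length vector per image.
vgg = VGG16(weights="imagenet", include_top=False, pooling="avg")

def face_features(image_batch):
    """Extract a 512-d VGG feature vector per 224x224 RGB face image."""
    return vgg.predict(preprocess_input(image_batch.astype("float32")))

def speech_features(signal, sr=16000):
    """Spectral features from a speech signal (MFCCs as one plausible choice)."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40)
    return mfcc.mean(axis=1)  # average over time -> fixed-length vector

def fuse_and_reduce(face_feats, speech_feats, n_components=64):
    """Feature-level fusion: concatenate the per-sample modality vectors,
    then reduce dimensionality with PCA before classification."""
    fused = np.concatenate([face_feats, speech_feats], axis=1)
    return PCA(n_components=n_components).fit_transform(fused)
```

A downstream classifier, for example a dense softmax layer over the seven emotion classes, would then be trained on the fused, PCA-reduced features.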