Improving Mental Health Through Multimodal Emotion Detection from Speech and Text Data Using Long-Short Term Memory


Abstract

In today’s world of cut-throat competition, where everyone is running an invisible race, we often find ourselves alone amongst the crowd. Advances in technology are making our lives easier, yet humans, social animals by nature, are losing touch with society. As a result, a large part of the population today suffers from psychological disorders. An inferiority complex, the inability to fulfil one’s dreams, loneliness, and similar pressures are common causes of disturbed mental stability, which may further lead to disorders such as depression. In extreme cases, depression costs precious lives when an individual decides to commit suicide.

The primary focus of this work is to assess an individual’s mental health in an interactive way with the core help of machine learning. To realize this objective, we use a long short-term memory (LSTM) architecture, a recurrent neural network (RNN) from the field of deep learning, trained on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and FastText datasets, achieving 86% accuracy on model-patient conversational data. We further discuss the scope for enhancing cognitive control over psychiatric disorders that may otherwise escalate to severe depression and suicide attempts.

The proposed system helps determine the severity of a person’s depression and supports the recovery process. It comprises a wristband that measures biological parameters, a headband that analyses mental health, and a user-friendly website and mobile application with a built-in chatbot. The AI-based chatbot talks to patients and helps them reveal thoughts they are otherwise unable to communicate to their peers. A person can chat via text message, which is stored in the database for further analysis. The novelty of this work lies in the sentiment analysis of voice chat, which creates a comfortable environment for the user.
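The classifier described above rests on the LSTM’s gating mechanism, which lets the network retain emotional context across a sequence of speech frames or word embeddings. A minimal NumPy sketch of a single LSTM cell and a final emotion-class readout is shown below; all dimensions, weights, and the four-class output are illustrative assumptions, not the authors’ actual model configuration or training setup.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    x      : input features at this step, shape (d_in,)
    h_prev : previous hidden state, shape (d_h,)
    c_prev : previous cell state, shape (d_h,)
    W      : stacked gate weights, shape (4*d_h, d_in + d_h)
    b      : stacked gate biases, shape (4*d_h,)
    """
    d_h = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0 * d_h:1 * d_h])   # input gate
    f = sigmoid(z[1 * d_h:2 * d_h])   # forget gate
    o = sigmoid(z[2 * d_h:3 * d_h])   # output gate
    g = np.tanh(z[3 * d_h:4 * d_h])   # candidate cell update
    c = f * c_prev + i * g            # new cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c

# Run a toy sequence (stand-in for per-frame speech features or
# FastText word vectors) and classify the final hidden state.
rng = np.random.default_rng(0)
d_in, d_h, n_classes = 8, 16, 4       # assumed sizes
W = rng.normal(scale=0.1, size=(4 * d_h, d_in + d_h))
b = np.zeros(4 * d_h)
W_out = rng.normal(scale=0.1, size=(n_classes, d_h))

h, c = np.zeros(d_h), np.zeros(d_h)
for x in rng.normal(size=(20, d_in)): # 20 time steps of features
    h, c = lstm_step(x, h, c, W, b)

logits = W_out @ h
probs = np.exp(logits) / np.exp(logits).sum()  # softmax over emotion classes
print(probs.shape)
```

In a full system such as the one described, the random weights would of course be learned by backpropagation through time, and the softmax would range over the emotion labels provided by RAVDESS.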

Citation (APA)

Bhagat, D., Ray, A., Sarda, A., Dutta Roy, N., Mahmud, M., & De, D. (2023). Improving Mental Health Through Multimodal Emotion Detection from Speech and Text Data Using Long-Short Term Memory. In Lecture Notes in Networks and Systems (Vol. 519 LNNS, pp. 13–23). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-19-5191-6_2
