Objectively Quantifying Pediatric Psychiatric Severity Using Artificial Intelligence, Voice Recognition Technology, and Universal Emotions: Pilot Study for Artificial Intelligence-Enabled Innovation to Address Youth Mental Health Crisis


Abstract

Background: Providing psychotherapy, particularly for youth, is a pressing challenge in the health care system. Traditional methods are resource-intensive, and there is a need for objective benchmarks to guide therapeutic interventions. Automated emotion detection from speech, using artificial intelligence (AI), is an emerging approach to address these challenges: speech carries vital information about emotional states, which can be used to improve mental health care services, especially for patients in distress.

Objective: This study aims to develop and evaluate automated methods for detecting the intensity of 4 emotions (anger, fear, sadness, and happiness) in audio recordings of patients’ speech, and to demonstrate the viability of deploying the models. The model was first validated by Alemu et al in a previous publication using a limited set of voice samples; this follow-up study validates it with a substantially larger set of voice samples.

Methods: We used audio recordings of patients, specifically children with high adverse childhood experience (ACE) scores. The average ACE score in our sample was 5 or higher, placing these children at the highest risk for chronic disease and social or emotional problems; by comparison, only 1 in 6 people in the general population has a score of 4 or above. Structured voice samples were collected by having each patient read a fixed script. In total, 4 highly trained therapists scored the audio segments for the intensity of each of the 4 emotions. We experimented with various preprocessing methods, including denoising, voice-activity detection, and diarization, and explored several model architectures, including convolutional neural networks (CNNs) and transformers. We trained emotion-specific transformer-based models and a generalized CNN-based model to predict emotion intensities.
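To illustrate one of the preprocessing steps named above, the following is a minimal sketch of an energy-based voice-activity detector. This is not the study's pipeline: the frame sizes and the dB threshold are arbitrary assumptions for illustration, and production systems typically use trained VAD models rather than a simple energy rule.

```python
import numpy as np

def energy_vad(signal, sr, frame_ms=25, hop_ms=10, threshold_db=-35.0):
    """Flag frames whose short-time energy exceeds a dB threshold
    relative to the loudest frame (toy energy-based VAD)."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    n_frames = 1 + max(0, (len(signal) - frame) // hop)
    # mean squared amplitude per frame
    energies = np.array([
        np.mean(signal[i * hop : i * hop + frame] ** 2)
        for i in range(n_frames)
    ])
    # energy in dB relative to the loudest frame (epsilon avoids log(0))
    db = 10 * np.log10(energies / (energies.max() + 1e-12) + 1e-12)
    return db > threshold_db

# synthetic example: 1 s silence, 1 s of a 440 Hz tone, 1 s silence
sr = 16000
t = np.arange(sr) / sr
sig = np.concatenate([np.zeros(sr), 0.5 * np.sin(2 * np.pi * 440 * t), np.zeros(sr)])
voiced = energy_vad(sig, sr)  # boolean mask; True marks "speech" frames
```

In a pipeline like the one described, frames flagged False would be dropped before feature extraction, so the models see only segments that actually contain speech.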
Results: The emotion-specific transformer-based models achieved a test-set precision of 86% and recall of 79% for binary emotional intensity classification (high vs low). The CNN-based model, generalized to predict the intensity of all 4 emotions, achieved a test-set precision and recall of 83% each.

Conclusions: Automated emotion detection from patients’ speech using AI models is feasible and achieves high accuracy. The transformer-based models performed better at emotion-specific detection, while the CNN-based model showed promise for generalized emotion detection. These models can serve as valuable decision-support tools for pediatricians and mental health providers to triage youth to appropriate levels of mental health care services.
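For readers less familiar with the reported metrics, precision and recall for a binary high/low intensity classifier can be computed as below. The labels here are illustrative placeholders, not data from the study.

```python
import numpy as np

def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = positive class)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
    fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
    fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives
    precision = tp / (tp + fp)  # of predicted "high", how many were high
    recall = tp / (tp + fn)     # of actual "high", how many were caught
    return precision, recall

# illustrative labels (1 = high intensity, 0 = low intensity)
y_true = np.array([1, 1, 1, 0, 0, 1, 0, 1])
y_pred = np.array([1, 1, 0, 0, 1, 1, 0, 1])
p, r = precision_recall(y_true, y_pred)  # both 0.8 for this toy example
```

In a triage setting, recall on the "high intensity" class is the more safety-critical number, since a false negative means a distressed child may not be escalated to care.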


Citation (APA)

Caulley, D., Alemu, Y., Burson, S., Bautista, E. C., Tadesse, G. A., Kottmyer, C., … Sezgin, E. (2023). Objectively Quantifying Pediatric Psychiatric Severity Using Artificial Intelligence, Voice Recognition Technology, and Universal Emotions: Pilot Study for Artificial Intelligence-Enabled Innovation to Address Youth Mental Health Crisis. JMIR Research Protocols, 12(1). https://doi.org/10.2196/51912
