The quantification of emotional states is an important step toward understanding wellbeing. Time series data from multiple modalities, such as physiological and motion sensor data, have proven integral to measuring and quantifying emotions. However, monitoring emotional trajectories over long periods is constrained by the limited size of the training data that can be collected, a shortcoming that may hinder the development of reliable and accurate machine learning models. To address this problem, this article proposes a four-step framework for emotional state recognition: (1) encoding time series data into coloured images; (2) leveraging pre-trained object recognition models in a Transfer Learning (TL) approach applied to the images from step 1; (3) utilising a 1D Convolutional Neural Network (CNN) to classify emotions from the physiological data; (4) concatenating the pre-trained TL model with the 1D CNN. We demonstrate that model performance when inferring real-world wellbeing, rated on a 5-point Likert scale, can be enhanced using this framework, reaching up to 98.5% accuracy and outperforming a conventional CNN by 4.5%. Subject-independent models using the same approach achieved an average accuracy of 72.3% (SD 0.038). The proposed methodology improves performance and helps overcome the problems posed by small training datasets.
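The abstract outlines the pipeline but not its implementation. Below is a minimal sketch of how the four steps could fit together, assuming a Gramian Angular Field (via the pyts library) as the signal-to-image encoding and an ImageNet-pretrained MobileNetV2 as the TL backbone; these choices, along with the window length, channel count, and layer sizes, are illustrative assumptions rather than the authors' stated configuration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model
from pyts.image import GramianAngularField  # one common signal-to-image encoding

NUM_CLASSES = 5            # 5-point Likert wellbeing labels
IMG_SIZE = 224             # input size expected by the pre-trained backbone
SIG_LEN, SIG_CH = 512, 4   # hypothetical window length / channel count

# --- Step 1: encode a multi-channel time series window as a coloured image ---
sig = np.random.randn(SIG_LEN, SIG_CH)  # stand-in for one physiological window
gaf = GramianAngularField(image_size=IMG_SIZE)
# map three channels onto the R, G, B planes of a single coloured image;
# GAF output already lies in [-1, 1], matching MobileNetV2's expected scaling
rgb = np.stack([gaf.fit_transform(sig[None, :, ch])[0] for ch in range(3)], axis=-1)

# --- Step 2: transfer learning branch on the signal-image encodings ---
backbone = tf.keras.applications.MobileNetV2(
    input_shape=(IMG_SIZE, IMG_SIZE, 3), include_top=False, weights="imagenet")
backbone.trainable = False  # freeze ImageNet features for TL

img_in = layers.Input((IMG_SIZE, IMG_SIZE, 3), name="signal_image")
x = backbone(img_in, training=False)
x = layers.GlobalAveragePooling2D()(x)

# --- Step 3: 1D CNN branch on the raw physiological windows ---
sig_in = layers.Input((SIG_LEN, SIG_CH), name="raw_signal")
y = layers.Conv1D(64, 7, activation="relu")(sig_in)
y = layers.MaxPooling1D(2)(y)
y = layers.Conv1D(128, 5, activation="relu")(y)
y = layers.GlobalAveragePooling1D()(y)

# --- Step 4: concatenate both branches and classify the Likert rating ---
z = layers.Concatenate()([x, y])
z = layers.Dense(128, activation="relu")(z)
out = layers.Dense(NUM_CLASSES, activation="softmax")(z)

model = Model([img_in, sig_in], out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

In this layout the frozen backbone contributes image features learned on ImageNet while the 1D branch learns directly from the raw signals; concatenating the two embeddings before the classification head corresponds to step 4 of the framework, and the backbone could later be unfrozen for fine-tuning.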
Woodward, K., Kanjo, E., & Tsanas, A. (2024). Combining Deep Learning with Signal-Image Encoding for Multi-Modal Mental Wellbeing Classification. ACM Transactions on Computing for Healthcare, 5(1). https://doi.org/10.1145/3631618