Abstract
In this work, we present an approach to understanding the computational methods and decision-making involved in identifying emotions in spontaneous speech. The selected task is based on Spanish TV debates, which entail a high level of complexity as well as additional subjectivity in the human perception-based annotation procedure. A simple convolutional neural model is proposed, and its behaviour is analysed to explain its decision-making. The proposed model slightly outperforms commonly used CNN architectures such as VGG16 while being much lighter. Internal layer-by-layer transformations of the input spectrogram are visualised and analysed. Finally, a class model visualisation is proposed as a simple interpretation approach, and its usefulness is assessed in this work.
de Velasco, M., Justo, R., López Zorrilla, A., & Torres, M. I. (2023). Analysis of Deep Learning-Based Decision-Making in an Emotional Spontaneous Speech Task. Applied Sciences (Switzerland), 13(2). https://doi.org/10.3390/app13020980