This chapter describes a typical modern speech emotion recognition engine of the kind that can be used to enhance the emotional intelligence of computer games or other technical systems. It highlights the acquisition of human affect from spoken content as well as from prosody and further acoustic features. Features for both of these information streams are briefly discussed, along with chunking of the stream. Decision making is presented both with and without training data. A particular focus is then placed on autonomous learning and adaptation methods, as well as on the required calculation of confidence measures. Practical aspects include the encoding of the information, distribution of the processing, and available toolkits. Benchmark performances are drawn from typical competitive challenges in the field.
Schuller, B. (2016). Emotion Modelling via Speech Content and Prosody: In Computer Games and Elsewhere (pp. 85–102). https://doi.org/10.1007/978-3-319-41316-7_5