Whence and Whither: The Automatic Recognition of Emotions in Speech (Invited Keynote)

  • Batliner A
Abstract

In this talk, we first sketch the (short) history of the automatic recognition of emotions in speech: studies on the characteristics of emotions in speech were published as early as the 1920s and 1930s; attempts to recognize them automatically began in the mid-1990s, dealing with acted data, which are still used often - too often, if we consider that drawing inferences from acted data onto realistic data is at least sub-optimal. In a second part, we present the necessary ‘basics’: the design of the scenario, the recordings, the manual processing (transliteration, annotation), etc. Some of these basics are ‘generic’ - for instance, every speech database has to be transliterated orthographically somehow. Others are specific, such as the principles and guidelines for emotion annotation, and the basic choices between, for example, dimensional and categorical approaches. The pros and cons of different annotation approaches have been discussed widely; the unit of analysis (utterance, turn, sentence, etc.), however, has received little attention so far, so we will discuss this topic in more detail. In a third part, we present acoustic and linguistic features that have been used (or should be used) in this field, and touch on their differing degrees of relevance. Classification and its necessary ingredients - feature reduction and selection, choice of classifier, and assessment of classification performance - are addressed in the fourth part. So far, we have been dealing with the ‘whence’ of our title, depicting the state of the art; we end the talk with the ‘whither’: promising applications and some speculations on dead-end approaches.
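The abstract mentions feature reduction/selection and classifier choice as ingredients of the classification step. As a purely illustrative sketch (all data and function names here are hypothetical, not from the talk), the pipeline can be mocked up with a crude univariate feature-selection score followed by a nearest-centroid classifier on synthetic "acoustic" feature vectors:

```python
# Illustrative sketch only: synthetic data, not a real emotion corpus.
# Select the most discriminative features by class-mean separation,
# then classify with a nearest-centroid rule.
import random

random.seed(0)

def make_sample(label, n_features=10):
    # Synthetic feature vector: only the first 3 features actually
    # separate the two (hypothetical) emotion classes.
    shift = 2.0 if label == "angry" else 0.0
    return [random.gauss(shift if i < 3 else 0.0, 1.0) for i in range(n_features)]

data = [("angry", make_sample("angry")) for _ in range(50)] + \
       [("neutral", make_sample("neutral")) for _ in range(50)]

def select_features(data, k):
    # Rank features by absolute difference of per-class means
    # (a crude univariate relevance score); keep the top-k indices.
    labels = sorted({lab for lab, _ in data})
    n = len(data[0][1])
    means = {lab: [0.0] * n for lab in labels}
    counts = {lab: 0 for lab in labels}
    for lab, x in data:
        counts[lab] += 1
        for i, v in enumerate(x):
            means[lab][i] += v
    for lab in labels:
        means[lab] = [m / counts[lab] for m in means[lab]]
    scores = [abs(means[labels[0]][i] - means[labels[1]][i]) for i in range(n)]
    return sorted(range(n), key=lambda i: -scores[i])[:k]

def train_centroids(data, idx):
    # Per-class centroid over the selected feature subset.
    cents = {}
    for lab in {l for l, _ in data}:
        xs = [x for l, x in data if l == lab]
        cents[lab] = [sum(x[i] for x in xs) / len(xs) for i in idx]
    return cents

def predict(x, idx, cents):
    proj = [x[i] for i in idx]
    return min(cents, key=lambda lab: sum((a - b) ** 2
                                          for a, b in zip(proj, cents[lab])))

idx = select_features(data, k=3)
cents = train_centroids(data, idx)
acc = sum(predict(x, idx, cents) == lab for lab, x in data) / len(data)
print(sorted(idx), round(acc, 2))
```

In practice, of course, real systems use far richer feature sets and stronger classifiers; the point is only the division of labor between selection and classification that the talk's fourth part addresses.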

Citation (APA)

Batliner, A. (2008). Whence and Whither: The Automatic Recognition of Emotions in Speech (Invited Keynote). In Perception in Multimodal Dialogue Systems (pp. 1–1). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-69369-7_1
