Toward a rule-based synthesis of emotional speech on linguistic descriptions of perception

Abstract

This paper reports rules for morphing a voice so that listeners perceive it as having various primitive features, for example, so that it sounds more "bright" or "dark". In previous work we proposed a three-layered model for the perception of emotional speech, consisting of emotional speech, primitive features, and acoustic features. Through experiments and acoustic analysis, we built the relationships between the three layers and reported that these relationships are significant. We then adopted a bottom-up method to verify the relationships. That is, we morphed (resynthesized) a speech voice by composing acoustic features in the bottommost layer to produce a voice in which listeners could perceive one or more primitive features, which in turn could be perceived as different categories of emotion. The intermediate results show that the relationships of the model built in the previous work are valid. © Springer-Verlag Berlin Heidelberg 2005.

Cite

Huang, C. F., & Akagi, M. (2005). Toward a rule-based synthesis of emotional speech on linguistic descriptions of perception. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3784 LNCS, pp. 366–373). Springer Verlag. https://doi.org/10.1007/11573548_47
