As speech synthesis technology develops more advanced paralinguistic capabilities, open questions emerge about how humans perceive the use of such vocal capabilities by robots. Perceptions of spoken interaction are complex and influenced by multiple factors, including the linguistic content of a message, the social context, the perceived intelligence of the agent, and the form factor of its embodiment. This paper reports results from a study that controlled for these factors in order to investigate the effect on human listeners of a male synthetic voice with an expressive range. Participants were randomly assigned to three conditions, counterbalanced for gender and language background, that varied how paralinguistic cues were applied. As the voice became more expressive and appropriate for the context, observers were more likely to describe the communication as effective, but were less likely to refer to the unseen agent as a person. Possible effects of listener gender and cultural-linguistic background are examined, and implications for future methodologies in this field are discussed.