Taking turns in a conversation is a delicate interplay of various signals, which we as humans can easily decipher. Embodied conversational agents (ECAs) communicating with humans should leverage this ability for smooth and enjoyable conversations. Extensive research has analyzed human turn-taking cues, and attempts have been made to predict turn-taking based on observed cues. These cues range from prosodic, semantic, and syntactic modulation, through adapted gesture and gaze behavior, to actively controlled respiration. However, when generating such behavior for social robots or ECAs, often only a single modality has been considered, e.g., gaze. We strive to design a comprehensive system that produces cues for all non-verbal modalities: gestures, gaze, and breathing. The system provides valuable cues without requiring any adaptation of the speech content. We evaluated our system in a VR-based user study with N = 32 participants completing two consecutive tasks. First, we asked them to listen to two ECAs taking turns in several conversations. Second, participants engaged in taking turns with one of the ECAs directly. We examined the system's usability and the perceived social presence of the ECAs' turn-taking behavior, both with respect to each individual non-verbal modality and their interplay. While we found effects of gesture manipulation in interactions with the ECAs, no effects on social presence were found.
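The abstract does not describe how the three non-verbal modalities are coordinated at turn boundaries. As a purely hypothetical illustration of such a rule-based cue planner (all names here, such as `plan_turn_cues`, `TurnIntent`, and `CueCommand`, are invented for this sketch and are not taken from the paper), a turn-yielding vs. turn-holding mapping could look like this:

```python
from dataclasses import dataclass
from enum import Enum, auto


class TurnIntent(Enum):
    YIELD = auto()   # the agent wants to hand the turn over
    HOLD = auto()    # the agent wants to keep speaking


@dataclass
class CueCommand:
    gaze_target: str       # where the agent should direct its gaze
    gesture_action: str    # retract vs. hold the current gesture
    breathing_action: str  # breathing pattern to animate/sonify


def plan_turn_cues(intent: TurnIntent, interlocutor_id: str) -> CueCommand:
    """Map a turn-taking intent to per-modality non-verbal cues.

    Illustrative mapping only, loosely based on commonly reported human
    turn-taking cues; the paper's actual system is not specified at this
    level of detail in the abstract.
    """
    if intent is TurnIntent.YIELD:
        return CueCommand(
            gaze_target=interlocutor_id,   # look toward the intended next speaker
            gesture_action="retract",      # withdraw ongoing gesticulation
            breathing_action="exhale",     # relaxed exhalation at phrase end
        )
    return CueCommand(
        gaze_target="averted",             # look away to keep the floor
        gesture_action="hold",             # keep the gesture in its hold phase
        breathing_action="quick_inhale",   # inhale to signal continuation
    )


if __name__ == "__main__":
    print(plan_turn_cues(TurnIntent.YIELD, "listener_avatar"))
```

Such a planner would run independently of the speech content itself, which matches the abstract's claim that the cues require no speech content adaptation.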
CITATION
Ehret, J., Bönsch, A., Nossol, P., Ermert, C. A., Mohanathasan, C., Schlittmeier, S. J., … Kuhlen, T. W. (2023). Who’s next? Integrating Non-Verbal Turn-Taking Cues for Embodied Conversational Agents. In Proceedings of the 23rd ACM International Conference on Intelligent Virtual Agents, IVA 2023. Association for Computing Machinery, Inc. https://doi.org/10.1145/3570945.3607312