Recent advances in deep learning have enabled the extraction of high-level skeletal features from raw images and video sequences, opening new possibilities for a variety of artificial intelligence tasks, including the automatic synthesis of human motion sequences. In this paper we present a system that combines 2D skeletal data with musical information to generate skeletal dancing sequences. The architecture is implemented solely with convolutional operations and trained with a teacher-forcing supervised learning approach, while novel motion sequences are synthesized through an autoregressive process. Additionally, an attention mechanism fuses the latent representations of past music and motion information in order to condition the generation process. To assess the system's performance, we generated 900 sequences and evaluated the perceived realism, motion diversity, and multimodality of the generated sequences based on various diversity metrics.
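The abstract outlines the pipeline only at a high level. The following is a minimal PyTorch sketch of how such a system could fit together: convolutional encoders for past motion and music, a cross-attention module that fuses the two latent sequences, and a convolutional decoder that predicts the next pose. All names, dimensions, and module choices here (FusionDanceModel, pose_dim=34 for a 17-joint 2D skeleton, music_dim=128, a single conv layer per encoder) are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class FusionDanceModel(nn.Module):
    """Hypothetical sketch: music-conditioned skeletal motion prediction."""

    def __init__(self, pose_dim=34, music_dim=128, latent_dim=256):
        super().__init__()
        # 1D convolutions over time encode past motion and music separately.
        self.motion_enc = nn.Conv1d(pose_dim, latent_dim, kernel_size=3, padding=1)
        self.music_enc = nn.Conv1d(music_dim, latent_dim, kernel_size=3, padding=1)
        # Cross-attention: each motion timestep attends to the music context,
        # fusing the two latent representations to condition generation.
        self.attn = nn.MultiheadAttention(latent_dim, num_heads=4, batch_first=True)
        # Convolutional decoder maps fused latents back to pose space.
        self.decoder = nn.Conv1d(latent_dim, pose_dim, kernel_size=1)

    def forward(self, past_motion, music):
        # past_motion: (batch, T, pose_dim); music: (batch, T, music_dim)
        m = self.motion_enc(past_motion.transpose(1, 2)).transpose(1, 2)
        a = self.music_enc(music.transpose(1, 2)).transpose(1, 2)
        fused, _ = self.attn(query=m, key=a, value=a)
        # Returns a pose prediction per timestep: (batch, T, pose_dim).
        return self.decoder(fused.transpose(1, 2)).transpose(1, 2)


# Autoregressive synthesis: each predicted pose is fed back as input.
# (During teacher-forcing training, ground-truth past poses would be fed
# instead.) Sequence length and feature values are placeholders.
model = FusionDanceModel().eval()
motion = torch.zeros(1, 1, 34)       # seed pose
music = torch.randn(1, 60, 128)      # precomputed music features
with torch.no_grad():
    for t in range(1, 60):
        next_pose = model(motion, music[:, :t])[:, -1:]
        motion = torch.cat([motion, next_pose], dim=1)
```

The split between teacher-forced training and autoregressive inference shown here mirrors the abstract's description; the specific fusion operator (multi-head cross-attention with the music latents as keys and values) is one plausible reading of "attention-based feature fusion".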
CITATION STYLE
Kritsis, K., Gkiokas, A., Pikrakis, A., & Katsouros, V. (2021). Attention-based Multimodal Feature Fusion for Dance Motion Generation. In Proceedings of the 2021 International Conference on Multimodal Interaction (ICMI '21) (pp. 763–767). Association for Computing Machinery. https://doi.org/10.1145/3462244.3479961