Quantized GAN for Complex Music Generation from Dance Videos

Abstract

We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates complex musical samples conditioned on dance videos. Our proposed framework takes dance video frames and human body motions as input, and learns to generate music samples that plausibly accompany the corresponding input. Most existing conditional music generation works generate specific types of mono-instrumental sounds using symbolic audio representations (e.g., MIDI) and usually rely on pre-defined musical synthesizers. In contrast, we generate dance music in complex styles (e.g., pop, breaking) by employing a Vector Quantized (VQ) audio representation, leveraging both the generality and the high abstraction capacity of its symbolic and continuous counterparts. By performing an extensive set of experiments on multiple datasets, and following a comprehensive evaluation protocol, we assess the generative qualities of our proposal against alternatives. The attained quantitative results, which measure the music consistency, beats correspondence, and music diversity, demonstrate the effectiveness of our proposed method. Last but not least, we curate a challenging dance-music dataset of in-the-wild TikTok videos, which we use to further demonstrate the efficacy of our approach in real-world applications, and which we hope will serve as a starting point for relevant future research. The dataset and code are available at https://github.com/L-YeZhu/D2M-GAN.
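
As a rough illustration of what a vector-quantized representation means in general, the sketch below maps continuous feature vectors to indices of their nearest entries in a codebook, then looks them back up. It is not the D2M-GAN implementation: the function names `quantize` and `dequantize`, the toy 2-D codebook, and the sample frames are all assumptions made for this example, and a real system would learn a much larger codebook from audio.

```python
def quantize(frames, codebook):
    """Map each continuous feature vector to the index of its nearest codebook entry."""
    indices = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every codebook vector.
        distances = [
            sum((x - c) ** 2 for x, c in zip(frame, code))
            for code in codebook
        ]
        indices.append(distances.index(min(distances)))
    return indices


def dequantize(indices, codebook):
    """Look the indices back up to recover the (lossy) continuous vectors."""
    return [codebook[i] for i in indices]


# Hypothetical toy codebook with three 2-D entries (illustration only).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5]]
# Hypothetical continuous features, one vector per time step.
frames = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.4]]

codes = quantize(frames, codebook)           # discrete token sequence
reconstruction = dequantize(codes, codebook)  # continuous approximation
```

The discrete `codes` play the role of a compact, symbol-like sequence, while the codebook lookup returns continuous vectors, which is the sense in which a VQ representation sits between symbolic and continuous audio representations.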

Cite

Zhu, Y., Olszewski, K., Wu, Y., Achlioptas, P., Chai, M., Yan, Y., & Tulyakov, S. (2022). Quantized GAN for Complex Music Generation from Dance Videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13697 LNCS, pp. 182–199). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19836-6_11
