Generative Adversarial Networks (GANs) are widely used to generate highly realistic artificial data. Because the output of these networks can be highly diverse and novel, GANs have potential as creative tools. However, their unpredictability and lack of controllability pose major challenges in this context, making it difficult for creative users to realize their artistic vision. To address this problem, we present two graphical user interfaces (GUIs) that visually order the otherwise chaotic latent input space of a GAN trained to generate drum samples. These GUIs also provide convergent search functions that allow users to fine-tune generated sounds. In this way, we enable users with an affinity for sound, such as musicians or sound engineers, to create sounds more purposefully. Additionally, we present the results of a user study that we conducted to explore our approach in accuracy-oriented and creative tasks. Our results indicate that usability and pragmatic qualities matter more to users than aesthetic-oriented aspects. Although the convergent search functions did not improve accuracy in reproduction tasks, we observed that, when available, they were used significantly more often than divergent/randomized search functions.
Schlagowski, R., Mertes, S., & André, E. (2021). Taming the chaos: Exploring graphical input vector manipulation user interfaces for GANs in a musical context. In ACM International Conference Proceeding Series (pp. 216–223). Association for Computing Machinery. https://doi.org/10.1145/3478384.3478411