Improving Semi-Supervised Differentiable Synthesizer Sound Matching for Practical Applications

Abstract

While synthesizers have become commonplace in music production, many users find it difficult to control a synthesizer's parameters to create the sound they intend. To assist the user, the sound matching task aims to estimate synthesis parameters that produce a sound as close as possible to the query sound. Recently, neural networks have been employed for this task. These networks are trained on paired data of synthesis parameters and the corresponding output sound, optimizing a loss on the synthesis parameters. However, user queries usually consist of real-world sounds, which differ from the synthesizer output sounds used as training data. In a previous work, the authors presented a sound matching method in which the synthesizer is implemented using differentiable DSP. The estimator network could then be trained by directly optimizing the spectral similarity between the original sound and the output sound. Furthermore, the network could be trained on real-world sounds whose ground-truth synthesis parameters are unavailable. This method was shown to improve match quality in both objective and subjective measures. In this work, we experiment with different synthesizer configurations and extend this approach to a more practical synthesizer with effect modules and envelope generators. We propose a novel training strategy in which the network is fully trained using both parameter loss and spectral loss. We show that models trained using this strategy are able to utilize the chorus effect effectively, while models that switch completely to spectral loss underutilize the chorus effect.
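The way the two losses can be combined is easier to see in a small sketch. The Python below is an illustration under stated assumptions, not the authors' implementation: the one-oscillator `synthesize` function, the two normalized parameters, the single-resolution spectral distance, and the weights `w_param` and `w_spec` are all hypothetical. It only evaluates the losses; in practice the synthesizer and losses would be written in an automatic-differentiation framework so that gradients can flow back to the estimator network.

```python
import cmath
import math


def synthesize(freq_norm, amp, n=64, sample_rate=8000.0):
    """Toy stand-in synthesizer (hypothetical): one sine oscillator.

    freq_norm in [0, 1] is mapped linearly to 100..2000 Hz.
    """
    freq_hz = 100.0 + freq_norm * 1900.0
    return [amp * math.sin(2.0 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]


def magnitude_spectrum(signal):
    """Magnitudes of a naive DFT over the non-negative frequency bins."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def parameter_loss(pred_params, true_params):
    """Mean squared error between predicted and ground-truth parameters."""
    return sum((p - q) ** 2
               for p, q in zip(pred_params, true_params)) / len(true_params)


def spectral_loss(pred_sound, target_sound):
    """Mean absolute difference between the two magnitude spectra."""
    a = magnitude_spectrum(pred_sound)
    b = magnitude_spectrum(target_sound)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def combined_loss(pred_params, target_sound, true_params=None,
                  w_param=1.0, w_spec=1.0):
    """Spectral loss always; parameter loss only when labels exist.

    Synthesizer-generated sounds come with ground-truth parameters, so both
    terms apply. Real-world sounds have none (true_params=None), so only the
    spectral term is used.
    """
    pred_sound = synthesize(*pred_params)
    loss = w_spec * spectral_loss(pred_sound, target_sound)
    if true_params is not None:
        loss += w_param * parameter_loss(pred_params, true_params)
    return loss


if __name__ == "__main__":
    true_params = (0.5, 0.8)
    target = synthesize(*true_params)
    guess = (0.3, 0.8)
    print("labeled sound:  ", combined_loss(guess, target, true_params))
    print("unlabeled sound:", combined_loss(guess, target))
```

Passing `true_params=None` corresponds to the semi-supervised case described above, where a real-world query has no ground-truth synthesis parameters and only the spectral term contributes.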

Cite

Masuda, N., & Saito, D. (2023). Improving Semi-Supervised Differentiable Synthesizer Sound Matching for Practical Applications. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 31, 863–875. https://doi.org/10.1109/TASLP.2023.3237161
