Abstract
This paper presents a systematic review of recent advances in music generation using deep learning techniques, categorizing the latest research in the field and identifying key contributions from various approaches. The study examines common data representations in music generation, including raw waveforms, spectrograms, and MIDI, alongside the most prominent deep learning architectures, such as Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs), Variational Autoencoders (VAEs), and Transformer-based models. Through a comparative analysis, the paper highlights the strengths and limitations of these approaches. The findings suggest that GANs with spectrograms and RNNs with MIDI data are particularly effective for generating multi-track music, while autoregressive models such as MusicGen and transformer models demonstrate superior performance in capturing long-term dependencies in music generation. Additionally, the paper underscores the emergence of diffusion models, which are gaining popularity for generating high-quality, complex music outputs. The major contribution of this review is the identification of the best-performing models for various music generation tasks, together with comprehensive insights into data representation methods, evaluation metrics, and future research directions.
Cite
Mitra, R., & Zualkernan, I. (2025). Music Generation Using Deep Learning and Generative AI: A Systematic Review. IEEE Access. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ACCESS.2025.3531798