Abstract
Audio steganography exploits redundancies in the human auditory system to conceal secret information within cover audio so that the hidden data remains imperceptible during normal listening. However, recent research shows that current audio steganography techniques are vulnerable to deep learning-based steganalyzers, which classify audio based on the high-dimensional features of stego audio. While deep learning-based steganography has been extensively studied for image covers, its application to audio remains underexplored, particularly with respect to robust embedding and extraction with minimal perceptual distortion. We propose a diffusion-based audio steganography model comprising two primary modules: (i) a diffusion-based embedding module that automatically integrates secret messages into cover audio while preserving high perceptual quality and (ii) a corresponding diffusion-based extraction module that accurately recovers the embedded data. The framework supports both embedding into pre-existing cover audio and generating high-quality cover audio for message embedding. After training, the model achieves state-of-the-art performance in terms of embedding capacity and resistance to detection by deep learning steganalyzers. Experimental results demonstrate that our diffusion-based approach significantly outperforms existing methods across varying embedding rates, yielding stego audio with superior auditory quality and lower detectability.
Cite
Xi, J., Xia, Z., Zhang, W., Xie, Y., & Zhao, L. (2025). Diffusion-Based Model for Audio Steganography. Electronics (Switzerland), 14(20). https://doi.org/10.3390/electronics14204019