Acoustic or Pattern? Speech Spoofing Countermeasure based on Image Pre-training Models

Abstract

Traditional speech spoofing countermeasures (CM) typically consist of a frontend, which extracts two-dimensional features from the waveform, and a Convolutional Neural Network (CNN) based backend classifier. This pipeline is, to some degree, similar to an image classification task. Pre-training is a widely used paradigm in many fields. Self-supervised pre-trained frontends such as Wav2Vec 2.0 have shown substantial improvements in the speech spoofing detection task. However, these pre-trained models are trained only on bona fide utterances. Moreover, acoustic pre-trained frontends can also be used in text-to-speech (TTS) and voice conversion (VC) tasks, which indicates that they learn the commonalities of speech rather than the information that discriminates real from fake data. The speech spoofing detection task and the image classification task share the same pipeline. Based on the hypothesis that CNNs follow the same pattern in capturing artefacts in these two tasks, we take the counterintuitive step of applying an image pre-trained CNN model to detect spoofed utterances. To supplement the model with potentially missing acoustic features, we concatenate Jitter and Shimmer features to the output embedding. Our proposed CM achieves top-level performance on the ASVspoof 2019 dataset.
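
The concatenation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how Jitter and Shimmer are computed, so the code assumes the common "local" definitions (mean absolute difference between consecutive cycles, divided by the mean), and it assumes that per-cycle pitch periods and peak amplitudes have already been extracted. The function names and sample values are hypothetical.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive values, divided by their mean.

    Applied to pitch periods this is the usual "local jitter"; applied to
    per-cycle peak amplitudes it is the usual "local shimmer". This is an
    assumed definition, not one taken from the paper.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def append_jitter_shimmer(embedding, periods, amplitudes):
    """Concatenate jitter and shimmer to a CNN output embedding (hypothetical helper)."""
    jitter = local_perturbation(periods)
    shimmer = local_perturbation(amplitudes)
    return list(embedding) + [jitter, shimmer]


# Hypothetical inputs: a tiny stand-in embedding plus per-cycle measurements.
embedding = [0.12, -0.40, 0.88]
periods = [0.0100, 0.0102, 0.0099, 0.0101]   # seconds per glottal cycle
amplitudes = [0.80, 0.82, 0.79, 0.81]        # peak amplitude per cycle
fused = append_jitter_shimmer(embedding, periods, amplitudes)
```

In this sketch the fused vector is simply the embedding with two extra scalar entries; a real system would feed it to a classification layer, and the embedding would come from the image pre-trained CNN applied to the two-dimensional speech feature.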

Cite

Lu, J., Li, Z., Zhang, Y., Wang, W., & Zhang, P. (2022). Acoustic or Pattern? Speech Spoofing Countermeasure based on Image Pre-training Models. In DDAM 2022 - Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia (pp. 77–84). Association for Computing Machinery, Inc. https://doi.org/10.1145/3552466.3556524
