Fully Automated End-to-End Fake Audio Detection

Abstract

Existing fake audio detection systems often rely on expert experience to design the acoustic features or to hand-tune the hyperparameters of the network structure. However, manual adjustment of these parameters can noticeably affect the results, and it is almost impossible to find the best set of parameters by hand. This paper therefore proposes a fully automated end-to-end fake audio detection method. We first use a pre-trained wav2vec model to obtain a high-level representation of the speech. For the network structure, we use a modified version of differentiable architecture search (DARTS) named light-DARTS. It learns deep speech representations while automatically learning and optimizing complex neural structures consisting of convolutional operations and residual blocks. Experimental results on the ASVspoof 2019 LA dataset show that the proposed system achieves an equal error rate (EER) of 1.08%, which outperforms the state-of-the-art single system.
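
For readers unfamiliar with the metric, the equal error rate quoted above is the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The following is a minimal sketch of how an EER can be approximated from detector scores. It is not the authors' evaluation code; the function name equal_error_rate, the convention that a higher score means "more likely bona fide", and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by using every observed score as a threshold.

    Assumption: a higher score means "more likely bona fide".
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    # One extra threshold above all scores covers the "reject everything" case.
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, best_eer = None, None
    for t in thresholds:
        # False acceptance: spoofed audio scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection: bona fide audio scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of the two rates where they are closest.
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy scores, invented purely for illustration.
toy_bonafide = [0.9, 0.8, 0.7, 0.3]
toy_spoof = [0.1, 0.2, 0.4, 0.6]
toy_eer = equal_error_rate(toy_bonafide, toy_spoof)
```

With a finite score set the two error rates may never be exactly equal, so this sketch reports the midpoint at the threshold where they are closest. Evaluation toolkits usually interpolate along the error curve instead, so their values can differ slightly on small samples.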

Cite

Wang, C., Yi, J., Tao, J., Sun, H., Chen, X., Tian, Z., … Fu, R. (2022). Fully Automated End-to-End Fake Audio Detection. In DDAM 2022 - Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia (pp. 27–33). Association for Computing Machinery, Inc. https://doi.org/10.1145/3552466.3556530
