Audio Deepfake Approaches

48Citations
Citations of this article
172Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

This paper presents a review of techniques involved in the creation and detection of audio deepfakes, the first section provides information about general deep fakes. In the second section, the main methods for audio deepfakes are outlined and subsequently compared. The results discuss various methods for detecting audio deepfakes, including analyzing statistical properties, examining media consistency, and utilizing machine learning and deep learning algorithms. Major methods used to detect fake audio in these studies included Support Vector Machines (SVMs), Decision Trees (DTs), Convolutional Neural Networks (CNNs), Siamese CNNs, Deep Neural Networks (DNNs), and a combination of CNNs and Recurrent Neural Networks (RNNs). The accuracy of these methods varied, with the highest accuracy being 99% for SVM and the lowest being 73.33% for DT. The Equal Error Rate (EER) was reported in a few of the studies, with the lowest being 2% for Deep-Sonar and the highest being 12.24 for DNN-HLLs. The t-DCF was also reported in some of the studies, with the Siamese CNN performing the best with a 55% improvement in min-t-DCF and EER compared to other methods.

Cite

CITATION STYLE

APA

Shaaban, O. A., Yildirim, R., & Alguttar, A. A. (2023). Audio Deepfake Approaches. IEEE Access, 11, 132652–132682. https://doi.org/10.1109/ACCESS.2023.3333866

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free