ResViT: A Framework for Deepfake Videos Detection

Abstract

Deepfake technology makes it easy to synthesize videos or images using deep learning techniques, posing a substantial threat to many of the world's well-known people. Spreading false news or a synthesized video or image of a person can harm them and erode public trust in social and electronic media. To efficiently identify deepfake images, we propose ResViT, which uses a ResNet model for feature extraction and a vision transformer for classification. The ResViT architecture extracts features from the frames of a video and uses them to classify the input as fake or real. Moreover, ResViT places equal emphasis on data pre-processing, as it improves performance. We conducted extensive experiments on the five most widely used datasets. Our analysis revealed that ResViT outperformed the baseline, achieving prediction accuracies of 80.48%, 87.23%, 75.62%, 78.45%, and 84.55% on the Celeb-DF, Celeb-DFv2, FaceForensics++, FF-Deepfake Detection, and DFDC2 datasets, respectively.
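The pipeline the abstract describes (CNN backbone extracts per-frame feature maps, which are flattened into tokens and classified by a transformer) can be sketched as follows. This is a conceptual illustration only, not the authors' implementation: the feature extractor is stubbed with random values standing in for ResNet activations, the attention layer is single-head, and all dimensions (224x224 frames, a 7x7x512 feature grid, two output classes) are common conventions rather than values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(frames):
    # Stand-in for a ResNet backbone: map each 224x224x3 frame to a
    # 7x7 grid of 512-d feature vectors (illustrative dimensions).
    n = frames.shape[0]
    return rng.standard_normal((n, 7, 7, 512))

def self_attention(tokens, W_q, W_k, W_v):
    # Single-head scaled dot-product attention, the core ViT operation.
    q, k, v = tokens @ W_q, tokens @ W_k, tokens @ W_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def classify(frames):
    d = 512
    feats = extract_features(frames)                 # (n, 7, 7, 512)
    tokens = feats.reshape(frames.shape[0], -1, d)   # 49 tokens per frame
    cls = np.zeros((1, d))                           # learnable [CLS] token in practice
    W_q, W_k, W_v = (rng.standard_normal((d, d)) * 0.02 for _ in range(3))
    W_out = rng.standard_normal((d, 2)) * 0.02       # logits for fake vs. real
    logits = []
    for t in tokens:
        seq = np.vstack([cls, t])                    # prepend the CLS token
        attended = self_attention(seq, W_q, W_k, W_v)
        logits.append(attended[0] @ W_out)           # classify from the CLS position
    return np.stack(logits)

frames = rng.standard_normal((4, 224, 224, 3))       # e.g. 4 sampled video frames
logits = classify(frames)
print(logits.shape)  # (4, 2): one fake/real score pair per frame
```

In a real system the stubbed backbone would be a pretrained ResNet truncated before its pooling layer, the projection matrices would be trained end-to-end, and per-frame predictions would be aggregated into a video-level decision.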

Citation (APA)

Ahmad, W., Ali, I., Shahzad, S. A., Hashmi, A., & Ghaffar, F. (2022). ResViT: A Framework for Deepfake Videos Detection. International Journal of Electrical and Computer Engineering Systems, 13(9), 807–813. https://doi.org/10.32985/ijeces.13.9.9
