Unlike traditional approaches based on frame-level features and temporal characteristics, we propose a deepfake video detection method based on visual-audio synchrony, which compares the audio stream and the visual stream using an improved Siamese neural network. We combine the audio stream and the visual stream into a two-channel input and design a two-branch network to detect visual-audio synchrony. Preliminary experiments demonstrate the effectiveness of the proposed method, which achieves the highest accuracy compared with other existing methods.
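The abstract does not specify the network internals, but the core idea of a two-branch Siamese comparison can be illustrated with a minimal sketch: each branch embeds one modality, and a similarity score between the embeddings measures synchrony (high for genuine, synchronised pairs; low for mismatched or manipulated ones). The dimensions, linear encoders, and cosine-similarity scoring below are illustrative assumptions, not the authors' actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(features, weights):
    # One branch: a linear projection followed by L2 normalisation.
    # (Stand-in for a learned encoder; assumed for illustration.)
    z = features @ weights
    return z / np.linalg.norm(z)

def sync_score(audio_feat, visual_feat, w_audio, w_visual):
    # Cosine similarity between the two branch embeddings.
    # A detector would threshold this score to flag fakes.
    return float(encode(audio_feat, w_audio) @ encode(visual_feat, w_visual))

# Hypothetical feature sizes: 128-d audio, 512-d visual, 64-d shared embedding
w_audio = rng.standard_normal((128, 64))
w_visual = rng.standard_normal((512, 64))
audio = rng.standard_normal(128)
visual = rng.standard_normal(512)

score = sync_score(audio, visual, w_audio, w_visual)  # value lies in [-1, 1]
```

In a trained system, the branch weights would be learned so that synchronised audio-visual pairs score high and desynchronised (deepfake) pairs score low, e.g. with a contrastive loss.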
Fan, Z., Zhan, J., & Jiang, W. (2021). Detecting deepfake videos by visual-audio synchronism: work-in-progress. In Proceedings - 2021 International Conference on Embedded Software, EMSOFT 2021 (pp. 31–32). Association for Computing Machinery, Inc. https://doi.org/10.1145/3477244.3477615