Multimodal Neurosymbolic Approach for Explainable Deepfake Detection

  • Haq I
  • Malik K
  • Muhammad K

Abstract

Deepfake detection has become increasingly important in recent years owing to the widespread availability of deepfake generation technologies. Existing deepfake detection methods have two primary limitations: they are trained on a specific type of deepfake dataset, which renders them vulnerable to unseen deepfakes, and they treat detection as a “black box” with limited explainability, making it difficult for non-AI experts to understand and trust their decisions. Hence, this paper proposes a novel neurosymbolic deepfake detection framework that exploits the fact that human emotions cannot be imitated easily owing to their complex nature. We argue that deepfakes typically exhibit inter- or intra-modality inconsistencies in the emotional expressions of the person being manipulated. Thus, the proposed framework performs inter- and intra-modality reasoning on emotions extracted from the audio and visual modalities using psychological and arousal-valence models for deepfake detection. In addition to fake detection, the proposed framework provides textual explanations for its decisions. The results obtained on the Presidential Deepfakes Dataset and the World Leaders Dataset of real and manipulated videos demonstrate the effectiveness of our approach in detecting deepfakes and highlight the potential of the neurosymbolic approach for explainability.
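
The consistency-reasoning step described in the abstract can be pictured with a small sketch. The Python snippet below is only an illustration of the general idea, not the authors' implementation: it assumes per-segment emotion estimates have already been extracted from the audio and visual streams as arousal-valence pairs, and the thresholds, distance measure, and names (EmotionEstimate, detect_deepfake) are hypothetical placeholders.

# Minimal sketch of inter-/intra-modality emotion-consistency reasoning.
# Emotion extractors, thresholds, and values below are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class EmotionEstimate:
    """Per-segment emotion in the arousal-valence plane, one per modality."""
    arousal: float   # activation level, assumed in [-1, 1]
    valence: float   # pleasantness, assumed in [-1, 1]

def av_distance(a: EmotionEstimate, b: EmotionEstimate) -> float:
    """Euclidean distance between two points in arousal-valence space."""
    return ((a.arousal - b.arousal) ** 2 + (a.valence - b.valence) ** 2) ** 0.5

def detect_deepfake(
    audio_emotions: List[EmotionEstimate],
    visual_emotions: List[EmotionEstimate],
    inter_thresh: float = 0.6,   # assumed tolerance for audio-visual disagreement
    intra_thresh: float = 0.8,   # assumed tolerance for abrupt within-modality jumps
) -> Tuple[bool, List[str]]:
    """Flag a video as fake when emotions disagree across or within modalities,
    and return a human-readable explanation for each violated rule."""
    explanations: List[str] = []

    # Inter-modality rule: audio and facial emotion should roughly agree per segment.
    for i, (a, v) in enumerate(zip(audio_emotions, visual_emotions)):
        d = av_distance(a, v)
        if d > inter_thresh:
            explanations.append(
                f"Segment {i}: audio and facial emotions diverge "
                f"(arousal-valence distance {d:.2f} > {inter_thresh})."
            )

    # Intra-modality rule: emotion should not jump implausibly between adjacent segments.
    for name, seq in (("audio", audio_emotions), ("visual", visual_emotions)):
        for i in range(1, len(seq)):
            d = av_distance(seq[i - 1], seq[i])
            if d > intra_thresh:
                explanations.append(
                    f"Segments {i - 1}->{i}: abrupt {name} emotion shift "
                    f"(distance {d:.2f} > {intra_thresh})."
                )

    return (len(explanations) > 0, explanations)

if __name__ == "__main__":
    # Toy example: a calm face paired with highly aroused, negative-sounding speech.
    audio = [EmotionEstimate(0.8, -0.7), EmotionEstimate(0.7, -0.6)]
    visual = [EmotionEstimate(0.1, 0.2), EmotionEstimate(0.0, 0.3)]
    is_fake, reasons = detect_deepfake(audio, visual)
    print("fake" if is_fake else "real")
    for r in reasons:
        print(" -", r)

In this toy example the calm facial emotions conflict with the highly aroused speech, so the inter-modality rule fires, the video is flagged, and the violated rules are returned as textual explanations, mirroring the explainability goal described in the abstract.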

Citation (APA)

Haq, I. U., Malik, K. M., & Muhammad, K. (2023). Multimodal Neurosymbolic Approach for Explainable Deepfake Detection. ACM Transactions on Multimedia Computing, Communications, and Applications. https://doi.org/10.1145/3624748
