No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

Abstract

Self-supervised approaches for video have shown impressive results in video understanding tasks. However, unlike early works that leverage temporal self-supervision, current state-of-the-art methods primarily rely on tasks from the image domain (e.g., contrastive learning) that do not explicitly promote the learning of temporal features. We identify two factors that limit existing temporal self-supervision: 1) the tasks are too simple, so training performance saturates, and 2) shortcuts based on local appearance statistics hinder the learning of high-level features. To address these issues, we propose 1) a more challenging reformulation of temporal self-supervision as frame-level (rather than clip-level) recognition tasks and 2) an effective augmentation strategy to mitigate shortcuts. Our model extends a representation of single video frames, pre-trained through contrastive learning, with a transformer that we train through temporal self-supervision. We demonstrate experimentally that our more challenging frame-level task formulations and the removal of shortcuts drastically improve the quality of features learned through temporal self-supervision. Our extensive experiments show state-of-the-art performance across 10 video understanding datasets, illustrating the generalization ability and robustness of our learned video representations. Project Page: https://daveishan.github.io/nms-webpage.
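
To make the clip-level versus frame-level distinction concrete, here is a minimal sketch. The abstract does not say which pretext tasks are used, so the out-of-order frame detection task below is an assumption chosen for illustration, not the authors' method. The names `make_frame_level_example` and `clip_level_label` are hypothetical.

```python
import random


def make_frame_level_example(num_frames, num_moved, seed=None):
    """Build one training example for a hypothetical out-of-order frame task.

    Returns (order, labels):
      order[i]  - original index of the frame shown at position i
      labels[i] - 1 if the frame at position i was moved, else 0
    """
    if num_moved == 1 or not 0 <= num_moved <= num_frames:
        raise ValueError("num_moved must be 0 or between 2 and num_frames")
    rng = random.Random(seed)
    order = list(range(num_frames))
    # Pick the positions to disturb, then rotate their contents by one
    # so that every chosen frame ends up at a different position.
    positions = sorted(rng.sample(range(num_frames), num_moved))
    values = [order[p] for p in positions]
    rotated = values[1:] + values[:1]
    for p, v in zip(positions, rotated):
        order[p] = v
    # Frame-level target: one binary label per frame position.
    labels = [int(order[i] != i) for i in range(num_frames)]
    return order, labels


def clip_level_label(labels):
    """Clip-level target: a single bit saying whether any frame was moved."""
    return int(any(labels))


if __name__ == "__main__":
    order, labels = make_frame_level_example(num_frames=8, num_moved=3, seed=0)
    print("frame order :", order)
    print("frame labels:", labels)
    print("clip label  :", clip_level_label(labels))
```

The clip-level target collapses the whole clip to one bit, whereas the frame-level target asks the model to localize every displaced frame. This is one way to read the abstract's claim that frame-level formulations are more challenging.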

Cite

Dave, I. R., Jenni, S., & Shah, M. (2024). No More Shortcuts: Realizing the Potential of Temporal Self-Supervision. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, pp. 1481–1491). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v38i2.27913
