Enhancing Audio-Visual Association with Self-Supervised Curriculum Learning

17Citations
Citations of this article
29Readers
Mendeley users who have this article in their library.

Abstract

The recent success of audio-visual representations learning can be largely attributed to their pervasive concurrency property, which can be used as a self-supervision signal and extract correlation information. While most recent works focus on capturing the shared associations between the audio and visual modalities, they rarely consider multiple audio and video pairs at once and pay little attention to exploiting the valuable information within each modality. To tackle this problem, we propose a novel audio-visual representation learning method dubbed self-supervised curriculum learning (SSCL) under the teacher-student learning manner. Specifically, taking advantage of contrastive learning, a two-stage scheme is exploited, which transfers the cross-modal information between teacher and student model as a phased process. The proposed SSCL approach regards the pervasive property of audiovisual concurrency as latent supervision and mutually distills the structure knowledge of visual to audio data. Notably, the SSCL method can learn discriminative audio and visual representations for various downstream applications. Extensive experiments conducted on both action video recognition and audio sound recognition tasks show the remarkably improved performance of the SSCL method compared with the state-of-the-art self-supervised audio-visual representation learning methods.

Cite

CITATION STYLE

APA

Zhang, J., Xu, X., Shen, F., Lu, H., Liu, X., & Shen, H. T. (2021). Enhancing Audio-Visual Association with Self-Supervised Curriculum Learning. In 35th AAAI Conference on Artificial Intelligence, AAAI 2021 (Vol. 4B, pp. 3351–3359). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v35i4.16447

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free