Cross Pixel Optical-Flow Similarity for Self-supervised Learning

Abstract

We propose a novel method for learning convolutional neural image representations without manual supervision. We use motion cues, in the form of optical flow, to supervise representations of static images. The obvious approach of training a network to predict flow from a single image can be needlessly difficult due to intrinsic ambiguities in this prediction task. We instead propose a much simpler learning goal: embed pixels such that the similarity between their embeddings matches that between their optical-flow vectors. At test time, the learned deep network can be used without access to video or flow information and transferred to tasks such as image classification, detection, and segmentation. Our method, which significantly simplifies previous attempts at using motion for self-supervision, achieves state-of-the-art results among self-supervised methods that use motion cues, and is overall state of the art in self-supervised pre-training for semantic image segmentation, as demonstrated on standard benchmarks.
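To make the learning goal concrete, below is a minimal NumPy sketch of one plausible instantiation of such a cross-pixel similarity-matching objective: pairwise similarity matrices are built over pixel embeddings and over the corresponding optical-flow vectors, each row is normalized into a probability distribution, and a cross-entropy between the two is minimized. The kernel choices (cosine for embeddings, a Gaussian-style kernel for flow), the temperatures, and all function names are illustrative assumptions rather than the paper's exact formulation.

import numpy as np

def row_softmax(sim):
    # Turn each row of a similarity matrix into a probability distribution.
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(sim)
    return e / e.sum(axis=1, keepdims=True)

def cross_pixel_flow_loss(embeddings, flows, temp_e=0.1, temp_f=1.0):
    # embeddings: (N, D) per-pixel embeddings produced by the CNN.
    # flows:      (N, 2) per-pixel optical-flow vectors for the same pixels.
    # Returns a scalar loss that encourages the similarity structure of the
    # embeddings to match the similarity structure of the flow field.

    # Pairwise similarities between pixel embeddings (cosine, assumed).
    e = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-8)
    sim_e = e @ e.T / temp_e

    # Pairwise similarities between flow vectors: negative squared Euclidean
    # distance, i.e. a Gaussian-style kernel in log space (assumed).
    d2 = ((flows[:, None, :] - flows[None, :, :]) ** 2).sum(axis=-1)
    sim_f = -d2 / temp_f

    p_e = row_softmax(sim_e)  # distribution induced by the embeddings
    p_f = row_softmax(sim_f)  # target distribution induced by the flow

    # Cross-entropy between flow-induced and embedding-induced distributions,
    # averaged over pixels.
    return -(p_f * np.log(p_e + 1e-8)).sum(axis=1).mean()

In practice a loss of this form would be evaluated on a subsample of pixels per image, since the similarity matrices grow quadratically with the number of pixels considered.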

Citation (APA)

Mahendran, A., Thewlis, J., & Vedaldi, A. (2019). Cross Pixel Optical-Flow Similarity for Self-supervised Learning. In Lecture Notes in Computer Science (Vol. 11365, pp. 99–116). Springer. https://doi.org/10.1007/978-3-030-20873-8_7
