We propose to self-supervise a convolutional neural network operating on images using temporal information from videos. The task is to learn a representation of single images, and the supervision is obtained by learning to group image pixels in such a way that their collective motion is “coherent”. This learning-by-grouping approach serves both as a pre-training strategy and as a segmentation method. Preliminary results suggest that the segments obtained are reasonable and that the learned representation transfers well to classification.
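To make the idea of "coherent" collective motion concrete, here is a minimal sketch of one possible coherence measure. It assumes that a group of pixels moves coherently when a single affine motion model explains the optical flow inside that group. This criterion, the function name `motion_coherence_loss`, and the array layouts are illustrative assumptions, not the formulation used in the paper.

```python
import numpy as np


def motion_coherence_loss(masks, flow):
    """Score how well soft pixel groups explain an optical-flow field.

    Hypothetical helper for illustration only.
    masks: array of shape (K, H, W), soft assignment of each pixel to K groups.
    flow:  array of shape (H, W, 2), optical-flow vector per pixel.
    Returns the mean weighted squared residual of per-group affine fits;
    lower values mean the groups move more coherently under this assumption.
    """
    K, H, W = masks.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Design matrix of an affine motion model: flow ~ [x, y, 1] @ A
    X = np.stack([xs.ravel(), ys.ravel(), np.ones(H * W)], axis=1).astype(float)
    F = flow.reshape(-1, 2).astype(float)

    total = 0.0
    for k in range(K):
        w = masks[k].ravel().astype(float)
        sw = np.sqrt(w)[:, None]
        # Weighted least-squares fit of the affine parameters for group k
        A, *_ = np.linalg.lstsq(X * sw, F * sw, rcond=None)
        residual = F - X @ A
        total += np.sum(w * np.sum(residual ** 2, axis=1))
    return total / (H * W)
```

In a training setting, a network would predict `masks` from a single image and a differentiable version of such a score would act as the supervisory signal. The NumPy version above only evaluates the score for a given grouping.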
Mahendran, A., Thewlis, J., & Vedaldi, A. (2019). Self-supervised segmentation by grouping optical-flow. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11133 LNCS, pp. 528–534). Springer Verlag. https://doi.org/10.1007/978-3-030-11021-5_31