Self-Supervised Pretraining (SSP) has been shown to boost performance on video-related tasks such as action recognition and pose estimation. It captures important spatiotemporal constraints that act as an implicit regularizer. This work leverages temporal derivatives and a novel sampling algorithm for sustained (long-term) SSP. The main limitations of our baseline approach are its inability to capture sustained temporal information, a weak sampling algorithm, and the need for manual parameter tuning. This work analyzes the Temporal Order Verification (TOV) problem in detail by incorporating multiple temporal derivatives for temporal information amplification and by using a novel sampling algorithm that requires no manual parameter adjustment. The key idea is that image-only tuples carry limited information and become virtually indiscriminative for cyclic events; this can be mitigated by fusing temporal derivatives with the image-only tuples. We explore a few simple yet powerful variants for TOV: one variant uses Motion History Images (MHI), while others use optical flow. The proposed TOV algorithm is compared with previous works and validated on the challenging HMDB51 and UCF101 benchmarks.
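The core idea above can be illustrated with a minimal sketch: sample an ordered frame triplet from a clip, fuse each frame with a temporal derivative channel (here a simple frame difference stands in for the optical flow or MHI variants described in the paper), and label the tuple by whether the temporal order is correct. All function names, the fixed wrong-order permutation, and the toy data below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def temporal_derivative(frames):
    """First-order temporal derivative: frame-to-frame differences.

    frames: (T, H, W) array. Returns a (T-1, H, W) array of differences,
    a crude stand-in for the optical flow / MHI derivatives in the paper.
    """
    return np.diff(frames.astype(np.float32), axis=0)

def make_tov_tuple(frames, idx, shuffle):
    """Build one Temporal Order Verification training sample (illustrative).

    idx: three increasing frame indices (i < j < k).
    Each sampled frame is fused with its temporal derivative so that
    cyclic motions remain distinguishable. Label 1 = correct order,
    label 0 = shuffled (wrong) order.
    """
    i, j, k = idx
    deriv = temporal_derivative(frames)
    # Fuse image and derivative along a new channel axis; the derivative
    # at index t approximates motion between frames t and t+1.
    fused = [
        np.stack([frames[t].astype(np.float32),
                  deriv[min(t, len(deriv) - 1)]])
        for t in (i, j, k)
    ]
    if shuffle:
        fused = [fused[o] for o in (1, 0, 2)]  # one wrong temporal order
        label = 0
    else:
        label = 1
    return np.stack(fused), label

rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(16, 8, 8))       # toy 16-frame clip
x, y = make_tov_tuple(video, (2, 7, 12), shuffle=False)
print(x.shape, y)  # (3, 2, 8, 8) 1
```

A classifier pretrained on such (tuple, label) pairs learns temporal structure without manual annotation; the image-plus-derivative channels are what keep cyclic events from collapsing into indistinguishable tuples.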
CITATION STYLE
Buckchash, H., & Raman, B. (2019). Sustained Self-Supervised Pretraining for Temporal Order Verification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11941 LNCS, pp. 140–149). Springer. https://doi.org/10.1007/978-3-030-34869-4_16