For visual estimation of optical flow, which underpins many vision analyses, unsupervised learning by view synthesis has emerged as a promising alternative to supervised methods because ground-truth flow is not readily available in many cases. However, unsupervised learning tends to be unstable when pixel tracking is lost through occlusion or motion blur, or when pixel correspondence is impaired by variations in image content and spatial structure over time. Recognizing that dynamic occlusions and object variations usually exhibit a smooth temporal transition in natural settings, we shifted our focus to modeling unsupervised optical flow learning from multi-frame sequences of such dynamic scenes. Specifically, we simulated various dynamic scenarios and occlusion phenomena based on the Markov property, allowing the model to extract motion laws and thus improve performance in dynamic and occluded areas, which diverges from existing methods that do not consider temporal dynamics. In addition, we introduced a temporal dynamic model built on a well-designed spatial-temporal dual recurrent block, resulting in a lightweight model structure with fast inference speed. Assuming temporal smoothness of optical flow, we used the prior motions of adjacent frames to supervise the occluded regions more reliably. Experiments on several optical flow benchmarks demonstrated the effectiveness of our method: its performance is comparable to several state-of-the-art methods, with advantages in memory and computational overhead.
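The sketch below illustrates, in PyTorch, the two training signals the abstract describes in general terms: a view-synthesis (photometric) loss on non-occluded pixels, and a temporal-smoothness term that supervises occluded pixels with the flow estimated for the previous frame pair. It is a minimal illustration under assumed tensor names and shapes, not the authors' implementation.

```python
# Minimal sketch (not the paper's code) of unsupervised flow supervision:
# photometric loss where pixels are visible, temporal-smoothness pseudo-labels
# where they are occluded. Names, shapes, and the weighting are assumptions.

import torch
import torch.nn.functional as F


def warp(image, flow):
    """Backward-warp `image` (B,C,H,W) with `flow` (B,2,H,W) via bilinear sampling."""
    b, _, h, w = image.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=image.device, dtype=image.dtype),
        torch.arange(w, device=image.device, dtype=image.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0) + flow   # pixel coordinates, (B,2,H,W)
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    grid_x = 2.0 * grid[:, 0] / (w - 1) - 1.0
    grid_y = 2.0 * grid[:, 1] / (h - 1) - 1.0
    return F.grid_sample(image, torch.stack((grid_x, grid_y), dim=-1), align_corners=True)


def unsupervised_flow_loss(frame_t, frame_t1, flow_t, flow_prev, occ_mask, alpha=0.5):
    """frame_t, frame_t1: consecutive frames (B,C,H,W); flow_t: estimated flow t->t+1;
    flow_prev: flow from the previous frame pair, used as a pseudo label on occluded
    pixels under a temporal-smoothness assumption; occ_mask: (B,1,H,W), 1 = occluded."""
    warped = warp(frame_t1, flow_t)
    # View-synthesis (photometric) loss, evaluated only where pixels stay visible.
    photometric = ((frame_t - warped).abs() * (1.0 - occ_mask)).mean()
    # Occluded pixels are pulled toward the prior motion of the adjacent frame pair.
    temporal = ((flow_t - flow_prev.detach()).abs() * occ_mask).mean()
    return photometric + alpha * temporal
```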
CITATION STYLE
Sun, Z., Luo, Z., & Nishida, S. (2024). Unsupervised learning of optical flow in a multi-frame dynamic environment using temporal dynamic modeling. Complex & Intelligent Systems, 10(2), 2215–2231. https://doi.org/10.1007/s40747-023-01266-2