Predicting Video Frames Using Feature Based Locally Guided Objectives

Abstract

This paper presents a feature-reconstruction-based approach using Generative Adversarial Networks (GANs) to solve the problem of predicting future frames from natural video scenes. Recent GAN-based methods often generate blurry outputs and perform poorly on long-range prediction. Our proposed method incorporates an intermediate feature-generating GAN to minimize the disparity between the ground-truth and predicted outputs. To further enhance the quality of the predicted frames, we propose two novel objective functions: (a) Locally Guided Gram Loss (LGGL) and (b) Multi-Scale Correlation Loss (MSCL). LGGL aids the feature-generating GAN in maximizing the similarity between the intermediate features of the ground truth and the network output by constructing Gram matrices from locally extracted patches over several levels of the generator. MSCL incorporates a correlation-based objective to effectively model the temporal relationships between the predicted and ground-truth frames at the frame-generating stage. Our proposed model is end-to-end trainable and exhibits superior performance compared to the state of the art on four real-world benchmark video datasets.
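The abstract gives no formulas, so the following is only a rough sketch of the two ideas it names, not the authors' definitions. It assumes feature maps of shape (channels, height, width), non-overlapping square patches, a mean-squared difference between per-patch Gram matrices, and a single-scale "one minus Pearson correlation" term. The function names, the `patch` parameter, and the normalization are all hypothetical choices made for illustration.

```python
import numpy as np

def gram(feat):
    """Gram matrix of a (C, H, W) feature block, normalized by its size."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def local_gram_loss(pred, target, patch=4):
    """Mean squared difference between Gram matrices of matching local patches.

    Assumption: non-overlapping patch x patch windows; the paper may differ.
    """
    _, h, w = pred.shape
    losses = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            gp = gram(pred[:, i:i + patch, j:j + patch])
            gt = gram(target[:, i:i + patch, j:j + patch])
            losses.append(np.mean((gp - gt) ** 2))
    return float(np.mean(losses))

def correlation_loss(pred, target, eps=1e-8):
    """One minus the Pearson correlation of two flattened frames.

    Assumption: a single-scale stand-in for a correlation-based objective.
    """
    p = pred.ravel() - pred.mean()
    t = target.ravel() - target.mean()
    corr = (p @ t) / (np.linalg.norm(p) * np.linalg.norm(t) + eps)
    return float(1.0 - corr)
```

Under these assumptions, both terms are zero when the prediction matches the ground truth and grow as the two diverge. A multi-scale variant could evaluate `correlation_loss` on progressively downsampled copies of the frames and sum the results; the per-level weights would be another assumption.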

Cite

Bhattacharjee, P., & Das, S. (2019). Predicting Video Frames Using Feature Based Locally Guided Objectives. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11364 LNCS, pp. 679–695). Springer Verlag. https://doi.org/10.1007/978-3-030-20870-7_42
