Pose-Forecasting Aided Human Video Prediction with Graph Convolutional Networks

Abstract

Human video prediction remains a challenging problem due to the uncertainty of future actions and the complexity of frame details. Recent methods tackle this problem in two steps: first forecast future human poses from the initial ones, then generate realistic frames conditioned on the predicted poses. Following this framework, we propose a novel Graph Convolutional Network (GCN) based pose predictor that comprehensively models human body joints and forecasts their positions holistically, together with a stacked generative model equipped with a temporal discriminator that iteratively refines the quality of the generated videos. The GCN-based pose predictor fully accounts for the relationships among body joints and produces more plausible pose predictions. Guided by the predicted poses, the temporal discriminator encodes temporal information into future frame generation to achieve high-quality results, and the stacked residual refinement generators make the outputs more realistic. Extensive experiments on benchmark datasets demonstrate that the proposed method produces better predictions than state-of-the-art methods, achieving up to a 15% improvement in PSNR.
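The core operation behind the pose predictor the abstract describes is graph convolution over the skeleton graph, where each body joint aggregates features from its connected joints before a learned projection. The following is a minimal NumPy sketch of one such layer under the standard GCN formulation X' = σ(Â X W); the toy skeleton, feature sizes, and random weights are illustrative assumptions, not the authors' exact model.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalize adjacency with self-loops:
    A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(x, adj_hat, weight):
    """One GCN layer: aggregate features from neighboring joints,
    project them with a learned weight matrix, apply ReLU."""
    return np.maximum(adj_hat @ x @ weight, 0.0)

# Hypothetical 5-joint skeleton: joint 1 (torso) connects to the others.
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
adj = np.zeros((5, 5))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 2))   # joints as 2-D coordinates
w = rng.standard_normal((2, 8))   # project to 8 hidden features
out = gcn_layer(x, normalize_adjacency(adj), w)
print(out.shape)                  # one hidden feature vector per joint
```

A pose predictor would stack several such layers and regress future joint coordinates from the final features; the normalization keeps the aggregation stable regardless of how many neighbors each joint has.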

Citation (APA)

Zhao, Y., & Dou, Y. (2020). Pose-Forecasting Aided Human Video Prediction with Graph Convolutional Networks. IEEE Access, 8, 147256–147264. https://doi.org/10.1109/ACCESS.2020.2995383
