Ego-pose estimation, i.e., estimating a person’s 3D pose with a single wearable camera, has many potential applications in activity monitoring. For these applications, both accurate and physically plausible estimates are desired, with the latter often overlooked by existing work. Traditional computer vision-based approaches using temporal smoothing only take into account the kinematics of the motion without considering the physics that underlies the dynamics of motion, which leads to pose estimates that are physically invalid. Motivated by this, we propose a novel control-based approach to model human motion with physics simulation and use imitation learning to learn a video-conditioned control policy for ego-pose estimation. Our imitation learning framework allows us to perform domain adaption to transfer our policy trained on simulation data to real-world data. Our experiments with real egocentric videos show that our method can estimate both accurate and physically plausible 3D ego-pose sequences without observing the cameras wearer’s body.
CITATION STYLE
Yuan, Y., & Kitani, K. (2018). 3D Ego-Pose Estimation via Imitation Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11220 LNCS, pp. 763–778). Springer Verlag. https://doi.org/10.1007/978-3-030-01270-0_45
Mendeley helps you to discover research relevant for your work.