2D Fingertip Localization on Depth Videos Using Paired Video-to-Video Translation

Abstract

We propose a two-stage pipeline that formulates 2D hand keypoint localization as a conditional video generation problem. The first stage learns a mapping from an input depth video in the source domain to an output depth video in which each of the five fingertips is marked with a distinct color, while enforcing temporal consistency constraints. The second stage applies color segmentation in the HSV domain and extracts the center of each segmented region as the 2D fingertip coordinates on the translated video. To the best of our knowledge, this is the first work to localize fingertips on depth videos through domain adaptation. Comparative experiments against state-of-the-art single-frame hand pose estimation on the challenging NYU dataset demonstrate that, by exploiting temporal information, our model maintains better hand appearance consistency in the video-to-video synthesis stage, leading to more accurate 2D hand pose estimates under motion blur caused by fast hand motion.
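The second stage described above — segmenting the five color marks in HSV space and taking each region's center as a fingertip coordinate — can be sketched roughly as follows. This is a minimal NumPy illustration on a synthetic HSV frame; the hue thresholds, saturation/value cutoffs, and the helper name `fingertip_centroids` are assumptions for demonstration, not details taken from the paper.

```python
import numpy as np

def fingertip_centroids(hsv_frame, hue_ranges, sat_min=0.5, val_min=0.5):
    """Locate one 2D fingertip per hue range by thresholding the HSV frame
    and taking the centroid (mean pixel coordinate) of each binary mask.

    hsv_frame: (H, W, 3) array with channels (hue, saturation, value) in [0, 1].
    hue_ranges: list of (low, high) hue intervals, one per fingertip color mark.
    Returns a list of (x, y) centroids, or None where no pixels matched.
    """
    h, s, v = hsv_frame[..., 0], hsv_frame[..., 1], hsv_frame[..., 2]
    centroids = []
    for lo, hi in hue_ranges:
        # Segment pixels whose hue falls in this mark's range and which are
        # saturated/bright enough to be a color mark rather than background.
        mask = (h >= lo) & (h <= hi) & (s >= sat_min) & (v >= val_min)
        ys, xs = np.nonzero(mask)
        if xs.size == 0:
            centroids.append(None)  # fingertip occluded or mark not found
        else:
            centroids.append((xs.mean(), ys.mean()))
    return centroids

# Synthetic 8x8 HSV frame with one red-hued 2x2 blob at rows 2-3, cols 5-6.
frame = np.zeros((8, 8, 3))
frame[2:4, 5:7] = [0.02, 1.0, 1.0]
print(fingertip_centroids(frame, [(0.0, 0.05)]))
```

In practice one range per fingertip color would be passed (five in total), and an OpenCV-style HSV conversion with integer hue scales would require rescaling the thresholds accordingly.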

Citation (APA)

Farahanipad, F., Nasr, M. S., Rezaei, M., Kamangar, F., Athitsos, V., & Huber, M. (2022). 2D Fingertip Localization on Depth Videos Using Paired Video-to-Video Translation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13599 LNCS, pp. 381–392). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-20716-7_30
