A Pipeline for Hand 2-D Keypoint Localization Using Unpaired Image to Image Translation

10Citations
Citations of this article
7Readers
Mendeley users who have this article in their library.

Abstract

Hand pose estimation is getting a lot of attention in many areas such as Human-Computer Interaction and Sign Language Recognition. A fundamental step to accurately estimate the hand pose involves detecting and localizing fingertips in an image. Despite the progress of 2-D hand pose estimation in recent studies, accurate and robust detection and localization of fingertips still remains a challenging task due to low resolution of a fingertip in images and varying lightning condition. Inspired by the progress of the Generative Adversarial Network (GAN) and image-style transfer, we propose a two-stage pipeline to accurately localize the fingertip position even in varying lighting and severe self occlusion on depth images. The idea is to use a Cycle-consistent Generative Adversarial Network (Cycle-GAN) to apply unpaired image-to-image translation and generate a depth image with colored predictions on the fingertips, wrist, and palm given a real depth image. The model is trained in a semi-supervised manner using a collection of images from source and target domains that do not need to be related in anyway. Then, by applying color segmentation techniques, we localize the center of each colored area which results in finding the location of each fingertip along with center of the wrist and the palm. The proposed method achieves visually promising results on noisy depth images captured using the Microsoft Kinect. Experiments on the challenging NYU hand dataset have demonstrated that our approach not only generates plausible samples, but also outperforms state-of-the-art approaches on 2-D fingertip estimation by a significant margin even in the presence of severe self-occlusion and varying lighting conditions. Moreover, fingertips would be detected irrespective of user orientation using this method.

Cite

CITATION STYLE

APA

Farahanipad, F., Rezaei, M., Dillhoff, A., Kamangar, F., & Athitsos, V. (2021). A Pipeline for Hand 2-D Keypoint Localization Using Unpaired Image to Image Translation. In ACM International Conference Proceeding Series (pp. 226–233). Association for Computing Machinery. https://doi.org/10.1145/3453892.3453904

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free