Real Time 3D Pose Estimation of Both Human Hands via RGB-Depth Camera and Deep Convolutional Neural Networks


Abstract

3D human hand pose estimation (HPE) is an essential methodology for smart human-computer interfaces. In particular, 3D hand pose estimation without attached or hand-held sensors offers a more natural and convenient way to interact. In this work, we present an HPE system that uses a single RGB-Depth camera and deep learning to recognize the 3D poses of both hands in real time. The system consists of four steps: hand detection and segmentation, right- and left-hand classification using a Convolutional Neural Network (CNN) classifier, hand pose estimation using a deep CNN regressor, and 3D hand pose reconstruction. First, both hands are detected and segmented from the RGB and depth images using skin detection and depth cutting algorithms. Second, a CNN classifier distinguishes right hands from left hands. The classifier consists of three convolutional layers and two fully connected layers and takes the segmented depth images as input. Third, a trained deep CNN regressor estimates sixteen key hand joints in 3D from the segmented left and right hand depth images separately. The regressor is composed hierarchically of multiple convolutional layers, pooling layers, and dense fully connected layers. Finally, the 3D pose of each hand is reconstructed from the estimated hand joints. The results show that the CNN classifier distinguishes right and left hands with an accuracy of 96.9%, and that the 3D hand poses are estimated with an average distance error of 8.48 mm. The presented HPE system can be used in various application fields, including medical VR, AR, and MR applications, where it should allow natural hand gesture interfaces for interacting with various medical content.
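The abstract reports an average distance error in millimetres but does not define how it is computed. A common reading, assumed here, is the mean Euclidean distance between each estimated joint and its ground-truth position. The sketch below illustrates that reading only; the function name `mean_joint_error_mm` and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math

NUM_JOINTS = 16  # the abstract reports sixteen key joints per hand


def mean_joint_error_mm(predicted, ground_truth):
    """Mean Euclidean distance (mm) between matching 3D joints.

    Assumption: both arguments are equal-length sequences of (x, y, z)
    coordinates in millimetres, listed in the same joint order.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("joint lists must have the same length")
    if not predicted:
        raise ValueError("at least one joint is required")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Toy data (not from the paper): each predicted joint is offset
# by (3, 4, 0) mm from its ground-truth position.
truth = [(float(i), 0.0, 400.0) for i in range(NUM_JOINTS)]
pred = [(x + 3.0, y + 4.0, z) for x, y, z in truth]
error = mean_joint_error_mm(pred, truth)
print(error)
```

Averaging over all joints of all test frames in the same way would give a single summary figure comparable in form to the one quoted above, although the authors' exact evaluation protocol may differ.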


Gi, G., Kim, T. Y., Park, H. M., Park, J. M., Dinh, D. L., Lee, S. Y., & Kim, T. S. (2020). Real Time 3D Pose Estimation of Both Human Hands via RGB-Depth Camera and Deep Convolutional Neural Networks. In IFMBE Proceedings (Vol. 69, pp. 467–471). Springer Verlag. https://doi.org/10.1007/978-981-13-5859-3_81
