Abstract
Human pose estimation is a challenging task due to the structured nature of its data. Existing methods primarily model pair-wise interactions between body joints, which is insufficient for scenarios involving overlapping joints and rapidly changing poses. To overcome these issues, we introduce a novel approach, the High-order Directed Transformer (HDFormer), which leverages high-order bone and joint relationships for improved pose estimation. Specifically, HDFormer combines self-attention with high-order attention to form a multi-order attention module. This module facilitates first-order "joint↔joint", second-order "bone↔joint", and high-order "hyperbone↔joint" interactions, effectively addressing complex and occlusion-heavy situations. In addition, modern CNN techniques are integrated into the transformer-based architecture, balancing the trade-off between performance and efficiency. HDFormer significantly outperforms state-of-the-art (SOTA) models on the Human3.6M and MPI-INF-3DHP datasets while requiring only 1/10 of the parameters and significantly lower computational cost. Moreover, HDFormer demonstrates broad real-world applicability, enabling real-time, accurate 3D pose estimation. The source code is available at https://github.com/hyer/HDFormer.
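To make the multi-order idea concrete, the sketch below illustrates one plausible reading of the abstract: joints attend to joints (first order), to bones derived from adjacent joint pairs (second order), and to "hyperbones" aggregated from chains of bones (high order). This is not the authors' implementation; the skeleton edges, feature dimensions, hyperbone construction, and additive fusion are all simplifying assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # standard scaled dot-product attention
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

# Hypothetical 5-joint skeleton; (parent, child) edges define bones.
edges = [(0, 1), (1, 2), (0, 3), (3, 4)]
rng = np.random.default_rng(0)
joints = rng.standard_normal((5, 8))  # per-joint features, dim 8

# First order: "joint <-> joint" self-attention.
out1 = attention(joints, joints, joints)

# Second order: "bone <-> joint" -- bones as differences of adjacent
# joint features (a common convention, assumed here).
bones = np.stack([joints[c] - joints[p] for p, c in edges])
out2 = attention(joints, bones, bones)

# High order: "hyperbone <-> joint" -- here a hyperbone is the sum of
# bone features along each root-to-leaf chain (an assumption).
paths = [[0, 1], [2, 3]]  # bone indices per chain
hyperbones = np.stack([bones[p].sum(axis=0) for p in paths])
out3 = attention(joints, hyperbones, hyperbones)

# Simplified multi-order fusion; HDFormer's actual fusion may differ.
fused = out1 + out2 + out3
```

Each attention call returns one feature per joint, so the three orders can be fused joint-wise; the actual model presumably learns projections and fusion weights rather than summing raw outputs.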
Citation
Chen, H., He, J. Y., Xiang, W., Cheng, Z. Q., Liu, W., Liu, H., … Xie, X. (2023). HDFormer: High-order Directed Transformer for 3D Human Pose Estimation. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2023-August, pp. 581–589). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2023/65