Abstract
This paper proposes Fusionformer, a 2D-to-3D supervised fusion method for 3D human pose estimation. It introduces a self-trajectory module and a cross-trajectory module to capture the motion differences and synergies of different joints. In addition, a Global-Local Fusion Block (GLF) combines global spatio-temporal pose features and local joint-trajectory features in parallel. Furthermore, to mitigate the impact of poor 2D poses on the 3D projection, a pose refinement network is introduced to enforce consistency of the 3D projection. The proposed method is evaluated on two benchmark datasets, Human3.6M and MPI-INF-3DHP. Compared to the PoseFormer and MGCN baselines, the results show improvements of 3.0% and 2.0% in MPJPE, respectively, on the Human3.6M dataset. By fully exploiting local joint synergy and adaptively fusing it with global pose features, our method demonstrates superior performance in 3D human pose estimation.
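To make the parallel fusion idea concrete, here is a minimal, hypothetical sketch of how a Global-Local Fusion Block might adaptively blend global pose features with local joint-trajectory features using a learned sigmoid gate. All names (`glf_fusion`, the gate parameters) and the gating design are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glf_fusion(global_feat, local_feat, w_gate, b_gate):
    """Adaptively fuse global spatio-temporal pose features with local
    joint-trajectory features via an elementwise sigmoid gate.

    This is an illustrative sketch only: the gate weights would normally
    be learned end-to-end, and the real GLF block may differ in structure.
    """
    # Gate conditioned on both feature streams (concatenated along channels).
    gate = sigmoid(np.concatenate([global_feat, local_feat], axis=-1) @ w_gate + b_gate)
    # Convex combination: gate near 1 favors the global stream.
    return gate * global_feat + (1.0 - gate) * local_feat

# Toy example: 17 joints, 32-dim features per joint (hypothetical sizes).
rng = np.random.default_rng(0)
g = rng.standard_normal((17, 32))
l = rng.standard_normal((17, 32))
w = rng.standard_normal((64, 32)) * 0.1
b = np.zeros(32)
fused = glf_fusion(g, l, w, b)
```

Because the gate lies in (0, 1), each fused feature is an elementwise convex combination of the two streams, so the block can smoothly trade off global context against local trajectory detail per joint and channel.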
Yu, X. (2024). Fusionformer: Exploiting the joint motion synergy with fusion network based on transformer for 3D human pose estimation. In Journal of Physics: Conference Series (Vol. 2786). Institute of Physics. https://doi.org/10.1088/1742-6596/2786/1/012015