We present a novel method to obtain a 3D Euclidean reconstruction of both the background and the moving objects in a video sequence. We assume that multiple objects are moving rigidly on a ground plane observed by a moving camera. The video sequence is first segmented into static background and motion blobs by a homography-based motion segmentation method. Classical "Structure from Motion" (SfM) techniques are then applied to obtain a Euclidean reconstruction of the static background. The motion blob corresponding to each moving object is treated as if it were a static object observed by a hypothetical moving camera, called a "virtual camera". This virtual camera shares the intrinsic parameters of the real camera but moves differently because of the object motion. The same SfM techniques are applied to estimate the 3D shape of each moving object and the pose of the virtual camera. We show that the unknown scale of the moving objects can be approximately determined from the ground plane, which is a key contribution of this paper. Another key contribution is a proof that the 3D motion of the moving objects can be solved from the virtual camera motion with a linear constraint imposed on the object translation. In our approach, a planar-translation constraint is formulated: "the 3D instantaneous translation of moving objects must be parallel to the ground plane". Results on real-world video sequences demonstrate the effectiveness and robustness of our approach.
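As an illustration of the virtual-camera idea, the following minimal sketch shows how an object's rigid motion might be recovered from the real and virtual camera poses, and how the planar-translation constraint can be checked. The conventions are assumptions made for this sketch and are not taken from the paper: the real camera maps world points as X_cam = R_c X_world + t_c, the virtual camera maps object points as X_cam = R_v X_obj + s t_v_hat with s the unknown object scale, the world frame coincides with the first camera frame, and n is the ground-plane normal in that frame. The function names `object_motion` and `planar_residual` are hypothetical.

```python
import math

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def object_motion(R_c, t_c, R_v, t_v_hat, scale):
    """Recover the object motion (R_o, t_o) in the world frame.

    Assumed model (not from the paper): R_v = R_c R_o and
    scale * t_v_hat = R_c t_o + t_c, so
    R_o = R_c^T R_v and t_o = R_c^T (scale * t_v_hat - t_c).
    """
    R_ct = transpose(R_c)
    R_o = matmul(R_ct, R_v)
    t_o = matvec(R_ct, [scale * tv - tc for tv, tc in zip(t_v_hat, t_c)])
    return R_o, t_o

def planar_residual(n, t_o):
    """Planar-translation constraint: n . t_o should be zero."""
    return dot(n, t_o)

# Synthetic check with made-up values (illustrative only).
n = [0.0, 1.0, 0.0]                        # assumed ground-plane normal
R_o_true, t_o_true = rot_y(0.3), [0.5, 0.0, 1.2]   # motion parallel to the plane
R_c = matmul(rot_x(0.1), rot_y(-0.2))      # real camera rotation
t_c = [0.1, -0.05, 0.3]                    # real camera translation
scale = 2.5                                # assumed known object scale

# Virtual camera pose implied by the assumed model.
R_v = matmul(R_c, R_o_true)
t_v = [a + b for a, b in zip(matvec(R_c, t_o_true), t_c)]
t_v_hat = [x / scale for x in t_v]         # what SfM would return, up to scale

R_o, t_o = object_motion(R_c, t_c, R_v, t_v_hat, scale)
print("recovered translation:", t_o)
print("planar residual:", planar_residual(n, t_o))
```

Under these assumptions the residual n . t_o depends linearly on the assumed scale, which is one way to see why a constraint on the object translation can be expressed linearly.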