In this paper, we propose TSFE-Net, a two-stream feature extraction network for active stereo matching. First, we apply local contrast normalization (LCN) to the dataset to compensate for the dependency of speckle intensity on distance. Second, we construct two-stream feature extraction layers, consisting of convolutional and deconvolutional layers at different scales, which simultaneously learn features from the original images and the LCN images and aggregate context information to form the left and right feature maps. Third, we convert the captured depth maps into disparity maps using the camera parameters to construct a supervised learning model. TSFE-Net not only mitigates the illumination effects caused by the dependency between speckle intensity and distance but also preserves the details of the original image. Our dataset is captured with a RealSense D435 camera. We conduct extensive quantitative and qualitative evaluations on a series of scenes and achieve an end-point error (EPE) of 0.335 on valid pixels on a TITAN Xp platform. The assessment results show that our network is capable of real-time depth reconstruction for active patterns.
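As a rough illustration of the two preprocessing steps mentioned in the abstract, the sketch below shows (1) local contrast normalization of an IR speckle image and (2) conversion of a depth map to a disparity map from camera parameters. The window size, epsilon, and camera values are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_contrast_normalization(img, window=11, eps=1e-6):
    """Subtract the local mean and divide by the local standard deviation."""
    img = img.astype(np.float32)
    mean = uniform_filter(img, size=window)
    mean_sq = uniform_filter(img * img, size=window)
    std = np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))
    return (img - mean) / (std + eps)

def depth_to_disparity(depth_m, focal_px, baseline_m, eps=1e-6):
    """disparity = f * B / Z; invalid (zero) depth is mapped to disparity 0."""
    valid = depth_m > eps
    disparity = np.zeros_like(depth_m, dtype=np.float32)
    disparity[valid] = focal_px * baseline_m / depth_m[valid]
    return disparity

if __name__ == "__main__":
    ir = np.random.randint(0, 255, (480, 640)).astype(np.float32)      # stand-in speckle image
    depth = np.random.uniform(0.3, 3.0, (480, 640)).astype(np.float32)  # stand-in depth in meters
    lcn = local_contrast_normalization(ir)
    disp = depth_to_disparity(depth, focal_px=640.0, baseline_m=0.05)   # assumed D435-like intrinsics
    print(lcn.shape, float(disp.mean()))
```

The two-stream feature extraction layers could be organized along the following lines: one branch takes the original image, the other its LCN version, each branch downsamples with strided convolutions and upsamples with a transposed convolution, and the resulting feature maps are fused. This is a minimal PyTorch sketch under assumed channel counts and layer depths, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class TwoStreamFeatureExtractor(nn.Module):
    def __init__(self, feat_ch=32):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, feat_ch, 3, stride=1, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),   # 1/2 scale
                nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),   # 1/4 scale
                nn.ConvTranspose2d(feat_ch, feat_ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),  # back to 1/2
            )
        self.raw_branch = branch()  # stream for the original image
        self.lcn_branch = branch()  # stream for the LCN image
        self.fuse = nn.Conv2d(2 * feat_ch, feat_ch, 3, padding=1)

    def forward(self, raw, lcn):
        f = torch.cat([self.raw_branch(raw), self.lcn_branch(lcn)], dim=1)
        return self.fuse(f)

# Usage: feats = TwoStreamFeatureExtractor()(raw_left, lcn_left)  # both tensors of shape (N, 1, H, W)
```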
Zeng, H., Wang, B., Zhou, X., Sun, X., Huang, L., Zhang, Q., & Wang, Y. (2021). TSFE-Net: Two-Stream Feature Extraction Networks for Active Stereo Matching. IEEE Access, 9, 33954–33962. https://doi.org/10.1109/ACCESS.2021.3061495