The practical application of machine perception to support physical manipulation in unstructured environments remains a barrier to the development of intelligent robotic systems. Recently, great progress has been made by the large-scale machine perception community, but these methods have made few contributions to applied robotic perception. This is in part because such large-scale systems are designed to recognize category labels of large numbers of objects from a single image, rather than to provide highly accurate, efficient, and robust pose estimation in environments for which a robot has reliable prior knowledge. In this paper, we illustrate the potential for synergistic integration of modern computer vision methods into robotics by augmenting a RANSAC-based registration method with a state-of-the-art semantic segmentation algorithm. We detail a convolutional architecture for semantic labeling of the scene, modified to operate efficiently using integral images. We combine this labeling with two novel scene-parsing variants of RANSAC and show, on a new RGB-D dataset containing complex configurations of textureless and highly specular objects, that our method improves pose estimation performance over the unaugmented algorithms.
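The abstract mentions integral images only in passing, so the following is a minimal sketch of that general technique (a summed-area table), not the authors' architecture. It illustrates why window sums over a feature map become constant-time lookups once the table is built. The function names `integral_image` and `box_sum` and the toy `feature` array are assumptions introduced purely for illustration.

```python
import numpy as np


def integral_image(img):
    """Build a summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] using four table lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]


# Hypothetical per-pixel feature map (illustrative values only).
feature = np.arange(12, dtype=np.float64).reshape(3, 4)
ii = integral_image(feature)

# Sum over the 2x2 window covering rows 1-2 and columns 1-2.
window_sum = box_sum(ii, 1, 1, 3, 3)
```

Building the table costs one pass over the image; afterwards, every rectangular sum costs four reads regardless of window size, which is the property that makes dense, multi-scale pooling of features inexpensive.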
Li, C., Bohren, J., & Hager, G. D. (2018). Bridging the Robot Perception Gap with Mid-Level Vision. In Springer Proceedings in Advanced Robotics (Vol. 3, pp. 5–20). Springer Science and Business Media B.V. https://doi.org/10.1007/978-3-319-60916-4_1