In this paper we show that combining knowledge of a camera's orientation with visual information improves the performance of semantic image segmentation. This is based on the assumption that the direction in which a camera is facing acts as a prior on the content of the images it creates. We gathered egocentric video with a camera attached to a head-mounted display, and recorded its orientation using an inertial sensor. By combining orientation information with typical image descriptors, we show that the segmentation accuracy of individual images improves from 61% to 71% over six classes, compared with vision alone. We also show that this method can be applied to both point- and line-based image features, and that these can be combined for further benefit. The resulting system would have applications in autonomous robot locomotion and in guiding visually impaired people.
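The fusion described above can be illustrated with a minimal sketch: the inertial orientation reading is appended to the visual descriptor so that a downstream classifier can exploit both cues. The function name, the normalisation by pi, and the toy descriptor are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def fuse_features(visual_desc, orientation):
    """Concatenate a visual descriptor with a camera orientation reading.

    orientation: (pitch, roll, yaw) in radians, from an inertial sensor.
    Scaling by pi keeps the orientation terms roughly in [-1, 1],
    comparable in magnitude to a normalised image descriptor.
    (Hypothetical fusion scheme for illustration only.)
    """
    ori = np.asarray(orientation, dtype=float) / np.pi
    return np.concatenate([np.asarray(visual_desc, dtype=float), ori])

# Toy example: a 4-D visual descriptor plus a 3-D orientation reading
v = np.array([0.2, 0.5, 0.1, 0.9])
o = (0.1, -0.3, 1.2)
f = fuse_features(v, o)  # 7-D fused feature vector
```

The fused vector can then be fed to any per-pixel or per-segment classifier; the orientation terms act as the directional prior discussed above.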
Haines, O., Bull, D. R., & Burn, J. F. (2016). Fusing inertial data with vision for enhanced image understanding. Communications in Computer and Information Science, 598, 205–226. https://doi.org/10.1007/978-3-319-29971-6_11