We describe an approach to incorporate scene topology and semantics into pixel-level object detection and localization. Our method requires video to determine occlusion regions and thence local depth ordering, and any visual recognition scheme that provides a score at local image regions, for instance object detection probabilities. We set up a cost functional that incorporates occlusion cues induced by object boundaries, label consistency and recognition priors, and solve it using a convex optimization scheme. We show that our method improves localization accuracy of existing recognition approaches, or equivalently provides semantic labels to pixel-level localization and segmentation. © 2013 Springer-Verlag.
CITATION STYLE
Taylor, B., Ayvaci, A., Ravichandran, A., & Soatto, S. (2013). Semantic video segmentation from occlusion relations within a convex optimization framework. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8081 LNCS, pp. 195–208). https://doi.org/10.1007/978-3-642-40395-8_15
Mendeley helps you to discover research relevant for your work.