We propose a framework for detecting, extracting, and modeling objects in natural scenes from multi-modal data. The framework is iterative and exploits different hypotheses in a complementary manner: each generated hypothesis feeds into the subsequent one, continuously refining the predictions about the scene. We apply the framework in realistic scenarios using visual appearance and depth information. A robotic manipulator interacts with the scene, and object hypotheses generated from appearance information are confirmed through pushing. Our results demonstrate the synergistic effect of applying multiple hypotheses to real-world scene understanding. The method is efficient and runs in real time. © 2011 Springer-Verlag.
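The hypothesize, push, and refine loop described in the abstract can be illustrated with a toy simulation. This is only a sketch under strong assumptions and not the authors' implementation: the `Point` type, the color-based grouping, and the `simulated_push` function are all hypothetical stand-ins (for the appearance cue and the robotic manipulator, respectively), and the toy only handles the case where appearance wrongly merges several objects into one hypothesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    pid: int
    color: str     # toy appearance cue (assumption)
    true_obj: int  # hidden ground truth, read only by the simulated push


def hypotheses_from_appearance(points):
    """Initial segmentation: one object hypothesis per color."""
    groups = {}
    for p in points:
        groups.setdefault(p.color, set()).add(p)
    return list(groups.values())


def simulated_push(hypothesis):
    """Hypothetical stand-in for the manipulator: push at one point of the
    hypothesis and return the points of that hypothesis that moved with it."""
    seed = min(hypothesis, key=lambda p: p.pid)
    return {p for p in hypothesis if p.true_obj == seed.true_obj}


def refine(points):
    """Iteratively confirm or split hypotheses until none are pending."""
    pending = hypotheses_from_appearance(points)
    confirmed = []
    while pending:
        hyp = pending.pop()
        moved = simulated_push(hyp)
        confirmed.append(moved)   # the part that moved together is one object
        rest = hyp - moved
        if rest:
            pending.append(rest)  # the remainder becomes the next hypothesis
    return confirmed


scene = [
    Point(0, "red", 0),
    Point(1, "red", 0),
    Point(2, "red", 1),   # same color as object 0, but a different object
    Point(3, "blue", 2),
]
segments = sorted(sorted(p.pid for p in seg) for seg in refine(scene))
print(segments)
```

In this toy scene the two red objects start out merged into a single hypothesis; the push separates them because only part of the hypothesis moves, and the remainder is fed back as a new hypothesis for the next iteration.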
CITATION STYLE
Bergström, N., Ek, C. H., Björkman, M., & Kragic, D. (2011). Scene understanding through autonomous interactive perception. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6962 LNCS, pp. 153–162). https://doi.org/10.1007/978-3-642-23968-7_16