Monocular 3D scene modeling and inference: Understanding multi-object traffic scenes

33Citations
Citations of this article
133Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Scene understanding has (again) become a focus of computer vision research, leveraging advances in detection, context modeling, and tracking. In this paper, we present a novel probabilistic 3D scene model that encompasses multi-class object detection, object tracking, scene labeling, and 3D geometric relations. This integrated 3D model is able to represent complex interactions like inter-object occlusion, physical exclusion between objects, and geometric context. Inference allows to recover 3D scene context and perform 3D multiobject tracking from a mobile observer, for objects of multiple categories, using only monocular video as input. In particular, we show that a joint scene tracklet model for the evidence collected over multiple frames substantially improves performance. The approach is evaluated for two different types of challenging onboard sequences. We first show a substantial improvement to the state-of-the-art in 3D multi-people tracking. Moreover, a similar performance gain is achieved for multi-class 3D tracking of cars and trucks on a new, challenging dataset. © 2010 Springer-Verlag.

Cite

CITATION STYLE

APA

Wojek, C., Roth, S., Schindler, K., & Schiele, B. (2010). Monocular 3D scene modeling and inference: Understanding multi-object traffic scenes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6314 LNCS, pp. 467–481). Springer Verlag. https://doi.org/10.1007/978-3-642-15561-1_34

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free