Convolutional neural networks have recently shown excellent results in general object detection and many other tasks. Albeit very effective, they involve many user-defined design choices. In this paper we seek to better understand these choices by inspecting two key questions: "what did the network learn?" and "what can the network learn?". We exploit new annotations (Pascal3D+) to enable a novel empirical analysis of the R-CNN detector. Contrary to common belief, our results indicate that existing state-of-the-art convnets are not invariant to various appearance factors. In fact, all considered networks have similar weak points which cannot be mitigated by simply increasing the training data; architectural changes are needed. We show that overall performance can improve when using image renderings for data augmentation. We report the best known results on the Pascal3D+ detection and viewpoint estimation tasks.
Pepik, B., Benenson, R., Ritschel, T., & Schiele, B. (2015). What is holding back convnets for detection? In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9358, pp. 517–528). Springer Verlag. https://doi.org/10.1007/978-3-319-24947-6_43