YOLO: You Only Look 10647 Times


Abstract

In this work, we explore the You Only Look Once (YOLO) single-stage object detection architecture and compare it to the simultaneous classification of 10647 fixed region proposals. We use two different approaches to demonstrate that each of YOLO’s grid cells is attentive to a specific sub-region of previous layers. This finding makes YOLO’s method comparable to local region proposals. Such insight reduces the conceptual gap between YOLO-like single-stage object detection models, R-CNN-like two-stage region proposal based models, and ResNet-like image classification models. For this work, we created interactive exploration tools for a better visual understanding of the YOLO information processing streams: https://limchr.github.io/yolo_visu
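The specific count of 10647 in the title is not derived in the abstract, but it matches the number of detection boxes produced by a standard YOLOv3 configuration (a 416×416 input, three detection heads at strides 32, 16, and 8, and three anchor boxes per grid cell). A minimal sketch of that arithmetic, under those assumptions:

```python
# Sketch: where 10647 detection boxes come from, assuming the standard
# YOLOv3 setup: 416x416 input, three detection scales (strides 32/16/8),
# and three anchor boxes per grid cell.
input_size = 416
strides = [32, 16, 8]      # downsampling factor at each detection head
anchors_per_cell = 3

total = 0
for stride in strides:
    grid = input_size // stride           # 13, 26, 52
    total += grid * grid * anchors_per_cell

print(total)  # 10647 = (13*13 + 26*26 + 52*52) * 3
```

Each of these boxes is scored in a single forward pass, which is the sense in which the network "looks" 10647 times at once rather than classifying region proposals sequentially.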

Cite

Limberg, C., Melnik, A., Ritter, H., & Prendinger, H. (2023). YOLO: You Only Look 10647 Times. In Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (Vol. 5, pp. 153–160). Science and Technology Publications, Lda. https://doi.org/10.5220/0011677300003417
