Abstract
In this work, we explore the You Only Look Once (YOLO) single-stage object detection architecture and compare it to the simultaneous classification of 10647 fixed region proposals. Using two different approaches, we demonstrate that each of YOLO's grid cells is attentive to a specific sub-region of the previous layers, which makes YOLO's method comparable to local region proposals. This insight narrows the conceptual gap between YOLO-like single-stage object detection models, R-CNN-like two-stage region-proposal-based models, and ResNet-like image classification models. For this work, we created interactive exploration tools for a better visual understanding of YOLO's information processing streams: https://limchr.github.io/yolo_visu
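The count of 10647 fixed proposals can be made concrete with a short sketch. Assuming the standard YOLOv3 configuration for a 416×416 input (a common setup, not stated in the abstract itself): detection happens at three scales with grid sizes 13×13, 26×26, and 52×52, and each grid cell predicts boxes for 3 anchors per scale.

```python
# Sketch: where YOLOv3's 10647 fixed region proposals come from,
# assuming a 416x416 input and 3 anchors per detection scale.
grid_sizes = [13, 26, 52]   # strides 32, 16, 8 on a 416x416 image
anchors_per_scale = 3

proposals = sum(anchors_per_scale * g * g for g in grid_sizes)
print(proposals)  # 3 * (169 + 676 + 2704) = 10647
```

Each of these proposals corresponds to one grid cell and anchor, i.e. one fixed sub-region of the image that the network classifies in a single forward pass.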
Limberg, C., Melnik, A., Ritter, H., & Prendinger, H. (2023). YOLO: You Only Look 10647 Times. In Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (Vol. 5, pp. 153–160). Science and Technology Publications, Lda. https://doi.org/10.5220/0011677300003417