PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images

20Citations
Citations of this article
53Readers
Mendeley users who have this article in their library.
Get full text

Abstract

The goal of this work is to establish a scalable pipeline for expanding an object detector towards novel/unseen categories, using zero manual annotations. To achieve that, we make the following four contributions: (i) in pursuit of generalisation, we propose a two-stage open-vocabulary object detector, where the class-agnostic object proposals are classified with a text encoder from pre-trained visual-language model; (ii) To pair the visual latent space (of RPN box proposals) with that of the pre-trained text encoder, we propose the idea of regional prompt learning to align the textual embedding space with regional visual object features; (iii) To scale up the learning procedure towards detecting a wider spectrum of objects, we exploit the available online resource via a novel self-training framework, which allows to train the proposed detector on a large corpus of noisy uncurated web images. Lastly, (iv) to evaluate our proposed detector, termed as PromptDet, we conduct extensive experiments on the challenging LVIS and MS-COCO dataset. PromptDet shows superior performance over existing approaches with fewer additional training images and zero manual annotations whatsoever. Project page with code: https://fcjian.github.io/promptdet.

Cite

CITATION STYLE

APA

Feng, C., Zhong, Y., Jie, Z., Chu, X., Ren, H., Wei, X., … Ma, L. (2022). PromptDet: Towards Open-Vocabulary Detection Using Uncurated Images. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13669 LNCS, pp. 701–717). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-20077-9_41

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free