Denseformer: A dense transformer framework for person re-identification

Citations: 11
Mendeley readers: 8

Abstract

Transformers have shown their effectiveness and advantages in many computer vision tasks, for example image classification and object re-identification (ReID). However, existing vision transformers are stacked layer by layer and lack direct information exchange among layers. Inspired by DenseNet, we propose a dense transformer framework (termed Denseformer) that connects each layer to every other layer through class tokens. We demonstrate that Denseformer consistently achieves better performance on person ReID tasks across datasets (Market-1501, DukeMTMC, MSMT17, and Occluded-Duke), at only a negligible increase in computation. We also show that Denseformer has several compelling advantages: it pays more attention to the main parts of human bodies and obtains discriminative global features.
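The dense class-token connections described above can be sketched roughly as follows. This is a minimal PyTorch sketch based on one plausible reading of the abstract, not the authors' implementation: each block fuses the class tokens produced by all earlier blocks before attending over the patch tokens. Names such as DenseformerBlock, fuse, and depth_so_far are illustrative assumptions.

# Minimal sketch (assumed reading of the abstract): every block receives the
# class tokens of all earlier blocks, fused by a linear projection, giving a
# dense, DenseNet-style connection pattern through class tokens.
import torch
import torch.nn as nn


class DenseformerBlock(nn.Module):
    def __init__(self, dim: int, num_heads: int, depth_so_far: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        # Fuse the class tokens produced by all previous layers (dense connection).
        self.fuse = nn.Linear(depth_so_far * dim, dim)

    def forward(self, x: torch.Tensor, prev_cls: list) -> torch.Tensor:
        # x: (B, 1 + N, dim), class token at position 0; prev_cls: list of (B, 1, dim).
        fused_cls = self.fuse(torch.cat(prev_cls, dim=-1))   # (B, 1, dim)
        x = torch.cat([fused_cls, x[:, 1:]], dim=1)          # replace the class token
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.norm2(x))
        return x


class Denseformer(nn.Module):
    def __init__(self, dim: int = 768, num_heads: int = 12, depth: int = 12, num_patches: int = 196):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, 1 + num_patches, dim))
        self.blocks = nn.ModuleList(
            [DenseformerBlock(dim, num_heads, depth_so_far=i + 1) for i in range(depth)]
        )

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (B, N, dim) already-embedded patch tokens.
        cls = self.cls_token.expand(patches.shape[0], -1, -1)
        x = torch.cat([cls, patches], dim=1) + self.pos_embed
        cls_history = [x[:, :1]]                              # class tokens from every layer so far
        for blk in self.blocks:
            x = blk(x, cls_history)
            cls_history.append(x[:, :1])
        return x[:, 0]                                        # global feature used for ReID

In this sketch the extra cost over a plain ViT is one linear fusion per block, which is consistent with the abstract's claim of only a negligible increase in computation.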

Cite (APA style)

Ma, H., Li, X., Yuan, X., & Zhao, C. (2023). Denseformer: A dense transformer framework for person re-identification. IET Computer Vision, 17(5), 527–536. https://doi.org/10.1049/cvi2.12118
