Active Token Mixer

Abstract

The three dominant network families, i.e., CNNs, Transformers, and MLPs, differ from each other mainly in how they fuse spatial contextual information, which places the design of more effective token-mixing mechanisms at the core of backbone architecture development. In this work, we propose a token mixer, dubbed Active Token Mixer (ATM), that actively incorporates flexible contextual information distributed across different channels from other tokens into the given query token. This fundamental operator actively predicts where to capture useful contexts and learns how to fuse the captured contexts with the query token at the channel level. In this way, the spatial range of token mixing can be expanded to a global scope with limited computational complexity, and the way tokens are mixed is reformed. We take ATM as the primary operator and assemble ATMs into a cascade architecture, dubbed ATMNet. Extensive experiments demonstrate that ATMNet is generally applicable and comprehensively surpasses different families of SOTA vision backbones by a clear margin on a broad range of vision tasks, including visual recognition and dense prediction tasks. Code is available at https://github.com/microsoft/ActiveMLP.
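
As a rough illustration of the idea described in the abstract, the pure-Python sketch below mixes tokens along a single axis: for each query token it predicts one offset per channel, gathers that channel from the token at the offset position, and blends it with the query's own value. This is a simplification under stated assumptions, not the authors' implementation (the repository linked above is the reference for that). The linear offset predictor, the names `predict_offsets` and `active_mix_1d`, the integer rounding of offsets, the boundary clamping, and the fixed per-channel `fuse_gate` are all hypothetical choices made for illustration.

```python
from typing import List

Token = List[float]  # one token = a list of per-channel values


def predict_offsets(token: Token,
                    offset_weights: List[List[float]],
                    offset_bias: List[float],
                    max_offset: int) -> List[int]:
    """Hypothetical offset predictor: a linear map of the query token
    yields one integer offset per channel, limited to +/- max_offset."""
    offsets = []
    for row, bias in zip(offset_weights, offset_bias):
        raw = sum(w * v for w, v in zip(row, token)) + bias
        offsets.append(max(-max_offset, min(max_offset, round(raw))))
    return offsets


def active_mix_1d(tokens: List[Token],
                  offset_weights: List[List[float]],
                  offset_bias: List[float],
                  fuse_gate: List[float],
                  max_offset: int) -> List[Token]:
    """Channel-wise mixing along one axis (illustrative assumption).

    For query token i and channel c, the context value is read from
    channel c of the token at position i + offset[c] (clamped to the
    sequence bounds), then blended with the query's own channel value
    using the per-channel gate fuse_gate[c] in [0, 1]."""
    n = len(tokens)
    mixed = []
    for i, query in enumerate(tokens):
        offsets = predict_offsets(query, offset_weights, offset_bias, max_offset)
        out = []
        for c, off in enumerate(offsets):
            src = max(0, min(n - 1, i + off))  # stay inside the sequence
            context = tokens[src][c]
            gate = fuse_gate[c]
            out.append(gate * context + (1.0 - gate) * query[c])
        mixed.append(out)
    return mixed


# Toy input: three tokens with two channels each.
tokens = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
zero_weights = [[0.0, 0.0], [0.0, 0.0]]  # offsets come from the bias only
shifted = active_mix_1d(tokens, zero_weights, [1.0, -1.0], [1.0, 1.0], max_offset=2)
```

In this toy call, channel 0 reads from the next token and channel 1 from the previous one, so each output token is recomposed from different spatial positions per channel. In a trained model the offsets and fusion weights would be learned and would depend on the query content rather than on a constant bias.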

Cite

Wei, G., Zhang, Z., Lan, C., Lu, Y., & Chen, Z. (2023). Active Token Mixer. In Proceedings of the 37th AAAI Conference on Artificial Intelligence, AAAI 2023 (Vol. 37, pp. 2759–2767). AAAI Press. https://doi.org/10.1609/aaai.v37i3.25376
