HaloAE: A Local Transformer Auto-Encoder for Anomaly Detection and Localization Based on HaloNet

Emilie Mathian; Huidong Liu; Lynnette Fernandez-Cuesta; Dimitris Samaras; Matthieu Foll; Liming Chen

Conference Proceedings (Open Access)

Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (2023), Vol. 5, pp. 325-337

DOI: 10.5220/0011865900003417

Abstract

Unsupervised anomaly detection and localization is a crucial task in many applications, e.g., defect detection in industry and cancer localization in medicine, and requires both local and global information, as enabled by the self-attention in Transformers. However, brute-force adaptation of a Transformer, e.g., ViT, suffers from two issues: 1) high computational complexity, which makes it hard to deal with high-resolution images; and 2) patch-based tokens, which are inappropriate for pixel-level dense prediction tasks such as anomaly localization and which ignore intra-patch interactions. We present HaloAE, the first auto-encoder based on a local 2D version of the Transformer with HaloNet, allowing intra-patch correlation computation with a receptive field covering 25% of the input image. HaloAE combines convolution and local 2D block-wise self-attention layers and performs anomaly detection and segmentation through a single model. Moreover, because the loss function is generally a weighted sum of several losses, we also introduce a novel dynamic weighting scheme to better optimize the learning of the model. The competitive results on the MVTec dataset suggest that vision models incorporating Transformers could benefit from a local computation of the self-attention operation and its very low computational cost, paving the way for applications on very large images.
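The cost argument in the abstract can be illustrated with a rough count of query-key pairs. The sketch below is not taken from the paper: it assumes a HaloNet-style layout in which the feature map is split into non-overlapping block x block tiles and each query attends only to its own tile extended by a halo on every side. The function names, the 64 x 64 feature-map size, and the block and halo values are hypothetical choices made for illustration only.

```python
def global_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when every position attends to every position."""
    n = height * width
    return n * n


def halo_attention_pairs(height: int, width: int, block: int, halo: int) -> int:
    """Query-key pairs when each query in a block x block tile attends only
    to that tile extended by `halo` positions on each side."""
    if height % block or width % block:
        raise ValueError("height and width must be multiples of block")
    window = block + 2 * halo  # side length of the key/value neighbourhood
    return height * width * window * window


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to make the comparison concrete.
    h = w = 64
    full = global_attention_pairs(h, w)
    local = halo_attention_pairs(h, w, block=8, halo=4)
    print(f"global: {full}, local: {local}, ratio: {full / local:.1f}")
```

Under these assumptions the global count grows quadratically with the number of positions, while the local count grows linearly for a fixed block and halo, which is the kind of saving that makes high-resolution inputs tractable.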

Citation (APA)

Mathian, E., Liu, H., Fernandez-Cuesta, L., Samaras, D., Foll, M., & Chen, L. (2023). HaloAE: A Local Transformer Auto-Encoder for Anomaly Detection and Localization Based on HaloNet. In Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (Vol. 5, pp. 325–337). Science and Technology Publications, Lda. https://doi.org/10.5220/0011865900003417

