Unsupervised Anomaly Detection on Microservice Traces through Graph VAE

Abstract

The microservice architecture is widely employed in large Internet systems. For each user request, a few of the microservices are called, and a trace is formed to record the tree-like call dependencies among microservices and the time consumption at each call node. Traces are useful in diagnosing system failures, but their complex structures make it difficult to model their patterns and detect their anomalies. In this paper, we propose a novel dual-variable graph variational autoencoder (VAE) for unsupervised anomaly detection on microservice traces. To reconstruct the time consumption of nodes, we propose a novel dispatching layer. We find that the inversion of negative log-likelihood (NLL) appears for some anomalous samples, which makes the anomaly score infeasible for anomaly detection. To address this, we point out that the NLL can be decomposed into KL-divergence and data entropy, whereas lower-dimensional anomalies can introduce an entropy gap with normal inputs. We propose three techniques to mitigate this entropy gap for trace anomaly detection: Bernoulli & Categorical Scaling, Node Count Normalization, and Gaussian Std-Limit. On five trace datasets from a top Internet company, our proposed TraceVAE achieves excellent F-scores.
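
The decomposition mentioned in the abstract can be read through the standard cross-entropy identity. The sketch below is only that textbook identity, not the paper's own derivation; the symbols p_d (the distribution a group of inputs is drawn from) and p_theta (the model's marginal likelihood) are notation assumed here for illustration.

```latex
% Expected NLL of inputs drawn from p_d under the model p_\theta
% (notation assumed for illustration; not taken from the paper):
\mathbb{E}_{x \sim p_d}\!\left[-\log p_\theta(x)\right]
  = \underbrace{\mathrm{KL}\!\left(p_d \,\|\, p_\theta\right)}_{\text{mismatch with the model}}
  + \underbrace{H(p_d)}_{\text{entropy of the inputs}}
```

Under this reading, a group of anomalies whose entropy H(p_d) is much lower than that of normal traces can receive a lower average NLL even when its KL term is larger, which is one way the inversion described above can arise.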

Cite

Xie, Z., Xu, H., Chen, W., Li, W., Jiang, H., Su, L., … Pei, D. (2023). Unsupervised Anomaly Detection on Microservice Traces through Graph VAE. In ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023 (pp. 2874–2884). Association for Computing Machinery, Inc. https://doi.org/10.1145/3543507.3583215
