HALO: Hierarchy-aware Fault Localization for Cloud Systems

Abstract

A typical cloud system produces a large amount of telemetry data, collected by pervasive software monitors that continuously track the health status of the system. This telemetry is essentially multi-dimensional data containing attributes and the failure/success status of the system being monitored. By identifying the attribute value combinations in which failures are most concentrated (which we call fault-indicating combinations), we can narrow the cause of system failures to a smaller scope, thus facilitating fault diagnosis. However, due to the combinatorial explosion problem and the latent hierarchical structure in cloud telemetry data, it remains intractable to localize a fault at the proper granularity efficiently. In this paper, we propose HALO, a hierarchy-aware fault localization approach for locating fault-indicating combinations in telemetry data. Our approach automatically learns the hierarchical relationships among attributes and leverages the hierarchy structure for precise and efficient fault localization. We have evaluated HALO on both industrial and synthetic datasets, and the results confirm that HALO outperforms existing methods. Furthermore, we have successfully deployed HALO to different services in Microsoft Azure and Microsoft 365 and witnessed its impact in real-world practice.
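To make the notion of a fault-indicating combination concrete, the following is a minimal illustrative sketch and not HALO's actual algorithm, which the abstract does not detail. It assumes a tiny hypothetical telemetry table (the attribute names `region`, `cluster`, and `os`, the records, and both function names are invented for illustration). It enumerates attribute value combinations by brute force, ranks them by failure rate times failure coverage, and uses a simple "each child value maps to one parent value" check as a stand-in for learning a hierarchy between attributes. The brute-force enumeration is exactly what becomes infeasible at scale, which is the combinatorial explosion the abstract refers to.

```python
from collections import Counter
from itertools import combinations

# Hypothetical telemetry: (attribute dict, did the request fail?).
RECORDS = [
    ({"region": "east", "cluster": "c1", "os": "linux"}, False),
    ({"region": "east", "cluster": "c1", "os": "windows"}, False),
    ({"region": "east", "cluster": "c2", "os": "linux"}, True),
    ({"region": "east", "cluster": "c2", "os": "windows"}, True),
    ({"region": "east", "cluster": "c2", "os": "linux"}, True),
    ({"region": "west", "cluster": "c3", "os": "linux"}, False),
    ({"region": "west", "cluster": "c3", "os": "windows"}, False),
    ({"region": "west", "cluster": "c4", "os": "linux"}, False),
]


def score_combinations(records, max_size=2):
    """Rank attribute value combinations by failure rate * failure coverage.

    This scoring rule is an assumption chosen for illustration only.
    """
    total = Counter()   # records matching each combination
    failed = Counter()  # failed records matching each combination
    for attrs, is_failure in records:
        items = sorted(attrs.items())
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                total[combo] += 1
                if is_failure:
                    failed[combo] += 1

    all_failures = sum(1 for _, is_failure in records if is_failure)
    scores = []
    for combo, count in total.items():
        failure_rate = failed[combo] / count
        coverage = failed[combo] / all_failures if all_failures else 0.0
        scores.append((combo, failure_rate, coverage))

    # Highest score first; on ties, prefer the smaller (coarser) combination.
    scores.sort(key=lambda s: (-(s[1] * s[2]), len(s[0])))
    return scores


def determines(records, child, parent):
    """Return True if every value of `child` occurs with exactly one value
    of `parent`, i.e. `child` plausibly sits below `parent` in a hierarchy."""
    seen = {}
    for attrs, _ in records:
        if seen.setdefault(attrs[child], attrs[parent]) != attrs[parent]:
            return False
    return True


if __name__ == "__main__":
    best_combo, rate, coverage = score_combinations(RECORDS)[0]
    print(best_combo, rate, coverage)
    print(determines(RECORDS, "cluster", "region"))
```

In this toy data every failure occurs in cluster `c2`, so the single-attribute combination for that cluster ranks first, ahead of the coarser `region` value that also contains healthy clusters. The `determines` check shows why a hierarchy matters: once `cluster` is known to sit under `region`, combinations pairing a cluster with its own region add no information and need not be enumerated.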

Citation (APA)

Zhang, X., Du, C., Li, Y., Xu, Y., Zhang, H., Qin, S., … Zhang, D. (2021). HALO: Hierarchy-aware Fault Localization for Cloud Systems. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 3948–3958). Association for Computing Machinery. https://doi.org/10.1145/3447548.3467190
