Diagnosing a performance bug triggered in production cloud environments is notoriously challenging. Extracting performance bug signatures can help cloud operators quickly pinpoint the problem and avoid repeating manual diagnosis effort for similar performance bugs. In this paper, we present PerfSig, a multi-modality performance bug signature extraction tool that identifies principal anomaly patterns and root cause functions for performance bugs. PerfSig performs fine-grained anomaly detection over various kinds of machine data, such as system metrics, system logs, and function call traces. We then conduct causal analysis across the different kinds of machine data using an information-theoretic method to pinpoint the root cause function of a performance bug. PerfSig generates each bug signature as the combination of the identified anomaly patterns and root cause functions. We have implemented a prototype of PerfSig and evaluated it using 20 real-world performance bugs in six commonly used cloud systems. Our experimental results show that PerfSig captures various kinds of fine-grained anomaly patterns from different machine data and successfully identifies the root cause functions through multi-modality causal analysis for 19 out of 20 tested performance bugs.
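The abstract does not say which information-theoretic measure PerfSig uses, so the following is only a minimal sketch of the general idea, assuming mutual information as a stand-in: each candidate function's per-window anomaly flags are scored against the anomaly flags of a system metric, and the functions are ranked by that score. The function names (`parseRequest`, `flushCache`), the binary flags, and the helper names are all hypothetical and are not taken from the paper. Mutual information measures statistical dependence only, so this illustrates the ranking step rather than a full causal analysis.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, of two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_functions(metric_anomaly, function_anomalies):
    """Rank candidate functions by mutual information with the metric anomaly flags."""
    scores = {
        name: mutual_information(flags, metric_anomaly)
        for name, flags in function_anomalies.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-window anomaly flags (1 = anomalous window), invented for illustration.
metric_anomaly = [0, 0, 0, 1, 1, 1, 0, 0]
function_anomalies = {
    "parseRequest": [0, 0, 0, 1, 1, 1, 0, 0],  # anomalous in the same windows as the metric
    "flushCache": [0, 1, 0, 1, 0, 1, 0, 1],    # anomalous in an unrelated alternating pattern
}

ranking = rank_functions(metric_anomaly, function_anomalies)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

In this toy input, the function whose anomalous windows coincide with the metric anomaly receives the higher score and is ranked first.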
He, J., Lin, Y., Gu, X., Yeh, C. C. M., & Zhuang, Z. (2022). PerfSig: Extracting Performance Bug Signatures via Multi-modality Causal Analysis. In Proceedings - International Conference on Software Engineering (Vol. 2022-May, pp. 1669–1680). IEEE Computer Society. https://doi.org/10.1145/3510003.3510110