Log data contains very rich and valuable information that records system states and behavior, which can be used to diagnose system failures. Anomaly detection from large-scale log data plays a key role in building secure and trustworthy systems. Anomaly detection model based on machine learning has achieved good results in practical applications. However, logs generated by modern large-scale distributed systems are more complex than ever before in terms of data size and variety. Therefore, the traditional single-machine learning anomaly detection model faces the model aging problem. We design an anomaly detection model that combines multiple machine learning algorithms. By using a conformal prediction, we can calculate the confidence of each algorithm for each log to be detected and use statistical analysis to tag them with a trusted label. The approach was tested on the public HDFS_100k log dataset, and the results show that our model is more accurate.
CITATION STYLE
Xie, X., Jin, Z., Han, Q., Huang, S., & Li, T. (2019). A confidence-guided anomaly detection approach jointly using multiple machine learning algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11983 LNCS, pp. 93–100). Springer. https://doi.org/10.1007/978-3-030-37352-8_8
Mendeley helps you to discover research relevant for your work.