Online anomaly detection using random forest

Zhiruo Zhao; Kishan G. Mehrotra; Chilukuri K. Mohan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10868 LNAI, pp. 135–147 (2018)

DOI: 10.1007/978-3-319-92058-0_13

Abstract

In this paper, we focus on how to use random-forest-based methods to improve the anomaly detection rate for streaming datasets. The key concept in a recent work [12] is to build a random forest in which, in any tree and at any internal node, a feature is randomly selected and the associated data space is partitioned in half. However, the model parameters were pre-defined, and the efficiency of applying this model under various conditions was not discussed. In this paper, we first give a mathematical justification of the required tree height and number of trees by casting the problem as a classical coupon collector problem. Then we design a majority-voting score combination strategy to combine the results from different anomaly detection trees. Finally, we apply feature clustering to group correlated features together in order to find anomalies that are jointly determined by subsets of features.
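The coupon collector argument mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how the authors map the problem onto tree height and number of trees, so the following only assumes that each internal node picks one of d features uniformly at random and asks how many such picks are expected before every feature has been chosen at least once. The classical answer is d · H_d, where H_d is the d-th harmonic number. The function names `expected_draws`, `simulate_draws`, and `mean_simulated_draws` are hypothetical and are not taken from the paper.

```python
import random
from fractions import Fraction


def expected_draws(num_features: int) -> float:
    """Classical coupon collector expectation: n * H_n uniform draws
    are needed on average to see each of n features at least once."""
    harmonic = sum(Fraction(1, k) for k in range(1, num_features + 1))
    return float(num_features * harmonic)


def simulate_draws(num_features: int, rng: random.Random) -> int:
    """Draw features uniformly at random until every feature has appeared.

    Assumption (not from the paper): one draw corresponds to one random
    feature selection at an internal node of some tree.
    """
    seen = set()
    draws = 0
    while len(seen) < num_features:
        seen.add(rng.randrange(num_features))
        draws += 1
    return draws


def mean_simulated_draws(num_features: int, trials: int = 2000, seed: int = 0) -> float:
    """Average the simulated draw count over many trials for comparison
    with the closed-form expectation."""
    rng = random.Random(seed)
    total = sum(simulate_draws(num_features, rng) for _ in range(trials))
    return total / trials
```

For a dataset with 10 features, `expected_draws(10)` gives roughly 29.3 random selections, and `mean_simulated_draws(10)` should land close to that value. Under the stated assumption, this is the kind of quantity that would bound how many split nodes a forest needs in total before every feature is likely to have been used.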

Citation (APA)

Zhao, Z., Mehrotra, K. G., & Mohan, C. K. (2018). Online anomaly detection using random forest. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10868 LNAI, pp. 135–147). Springer Verlag. https://doi.org/10.1007/978-3-319-92058-0_13
