Online anomaly detection using random forest


Abstract

In this paper, we focus on how to use random-forest-based methods to improve the anomaly detection rate for streaming datasets. The key concept in a recent work [12] is to build a random forest in which, in every tree and at every internal node, a feature is randomly selected and the associated data space is partitioned in half. However, the model parameters were predefined, and the efficiency of applying this model under various conditions was not discussed. In this paper, we first give a mathematical justification of the required tree height and number of trees by casting the problem as a classical coupon collector problem. Then we design a majority-voting score combination strategy to combine the results from different anomaly detection trees. Finally, we apply feature clustering to group correlated features together in order to find anomalies that are jointly determined by subsets of features.
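To make two of the ideas named in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the coupon collector problem is mapped onto tree height and tree count, so the first function only computes the classical result (the expected number of uniform random draws needed to see all n items at least once, n * H_n). One plausible reading, assumed here, is that the "coupons" are features drawn at random at internal nodes. The second function shows a plain majority vote over per-tree anomaly scores; the names `expected_draws`, `majority_vote`, and the `threshold` parameter are hypothetical.

```python
from fractions import Fraction


def expected_draws(n: int) -> float:
    """Classical coupon collector: expected number of uniform random draws
    needed to observe all n distinct items at least once, i.e. n * H_n.

    Assumption (not stated in the abstract): the items are features that
    trees pick at random when splitting.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # H_n = 1 + 1/2 + ... + 1/n, computed exactly to avoid rounding drift.
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return float(n * harmonic)


def majority_vote(tree_scores, threshold: float) -> bool:
    """Flag a point as anomalous when strictly more than half of the trees
    score it above `threshold`.

    Assumption: each tree emits one numeric score where higher means
    "more anomalous"; the paper's actual scoring rule may differ.
    """
    scores = list(tree_scores)
    if not scores:
        raise ValueError("at least one tree score is required")
    votes = sum(1 for s in scores if s > threshold)
    return 2 * votes > len(scores)
```

For example, `expected_draws(3)` evaluates 3 * (1 + 1/2 + 1/3), and `majority_vote([0.9, 0.8, 0.1], threshold=0.5)` flags the point because two of the three trees vote for it. A tie does not count as a majority in this sketch.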

Citation (APA)

Zhao, Z., Mehrotra, K. G., & Mohan, C. K. (2018). Online anomaly detection using random forest. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10868 LNAI, pp. 135–147). Springer Verlag. https://doi.org/10.1007/978-3-319-92058-0_13
