PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams

Ocean Wu; Yun Sing Koh; Gillian Dobbie; Thomas Lacombe

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020) 12085 LNAI 17-30

DOI: 10.1007/978-3-030-47436-2_2

Citations: 6

Readers: 7

Abstract

To adapt random forests to the dynamic nature of data streams, the state-of-the-art technique discards trained trees and grows new ones whenever concept drift is detected. This is particularly wasteful when recurrent patterns exist. In this work, we introduce a novel framework called PEARL, which uses both an exact technique and a probabilistic graphical model with Lossy Counting to replace drifted trees with relevant trees built in the past. The exact technique uses pattern matching to find the set of drifted trees that co-occurred in past predictions. Meanwhile, a probabilistic graphical model is built to capture the tree replacements among recurrent concept drifts. Once the graphical model becomes stable, it replaces the exact technique and finds relevant trees in a probabilistic fashion. Further, Lossy Counting is applied to the graphical model, which adds theoretical guarantees on both error rate and space complexity. We empirically show that our technique outperforms baselines in terms of cumulative accuracy on both synthetic and real-world datasets.
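
For readers unfamiliar with Lossy Counting, the following is a minimal sketch of the generic algorithm, not the PEARL implementation. It assumes the counted items are tree-replacement edges written as hypothetical strings such as "t1->t2"; the class name `LossyCounter` and its methods are likewise illustrative. With error parameter `epsilon`, each stored count undercounts the true frequency by at most `epsilon * n` after `n` items, and infrequent items are pruned at bucket boundaries to bound memory.

```python
import math


class LossyCounter:
    """Generic Lossy Counting over a stream (illustrative sketch only)."""

    def __init__(self, epsilon):
        if not 0 < epsilon < 1:
            raise ValueError("epsilon must be in (0, 1)")
        self.epsilon = epsilon
        self.width = math.ceil(1 / epsilon)  # bucket width
        self.n = 0                           # items seen so far
        self.entries = {}                    # item -> [count, max_undercount]

    def add(self, item):
        self.n += 1
        bucket = math.ceil(self.n / self.width)  # current bucket id
        if item in self.entries:
            self.entries[item][0] += 1
        else:
            # A new item may have been pruned before; it can have been
            # missed at most (bucket - 1) times.
            self.entries[item] = [1, bucket - 1]
        if self.n % self.width == 0:
            # Bucket boundary: drop items whose upper-bound count is too low.
            self.entries = {
                key: value
                for key, value in self.entries.items()
                if value[0] + value[1] > bucket
            }

    def count(self, item):
        """Return the stored (possibly undercounted) frequency of item."""
        entry = self.entries.get(item)
        return entry[0] if entry else 0


# Hypothetical stream of tree-replacement edges; "t1->t2" recurs.
stream = ["t1->t2", "t3->t4", "t1->t2", "t5->t6",
          "t1->t2", "t7->t8", "t1->t2", "t9->t0"]
counter = LossyCounter(epsilon=0.25)
for edge in stream:
    counter.add(edge)
```

In this toy stream only the recurring edge survives pruning, which is the behavior that keeps a graph of frequent replacements compact while rare ones are forgotten.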

Cite (APA)

Wu, O., Koh, Y. S., Dobbie, G., & Lacombe, T. (2020). PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12085 LNAI, pp. 17–30). Springer. https://doi.org/10.1007/978-3-030-47436-2_2
