Combining active learning and self-labeling for data stream mining

Łukasz Korycki; Bartosz Krawczyk

Conference Proceedings

Advances in Intelligent Systems and Computing (2018), vol. 578, pp. 481-490

DOI: 10.1007/978-3-319-59162-9_50

11 Citations

20 Readers

Abstract

Data stream mining is among the most vital contemporary data science challenges. In this work we concentrate on the actual availability of true class labels. The assumption that the ground truth for each instance becomes known right after processing it is far from realistic, because acquiring labels is usually costly. Active learning is an attractive solution to this problem, as it selects the most valuable instances for labeling. In this paper, we propose to augment the active learning module with a self-labeling approach. This allows the classifier to automatically label the instances for which it displays the highest certainty and to use them for further training. Although in this preliminary work we use a static threshold for self-labeling, the obtained results are encouraging. Our experimental study shows that this approach complements the active learning strategy and improves data stream classification, especially in scenarios with a very small labeling budget.
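
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid learner, the uncertainty-based query rule, the budget check, the threshold values, and the names `CentroidClassifier`, `run_stream`, and `make_stream` are all assumptions chosen for illustration. The only element taken from the abstract is the combination of a budgeted active learning step with self-labeling under a static confidence threshold.

```python
import math
import random


class CentroidClassifier:
    """Tiny incremental nearest-centroid learner (stand-in for any stream classifier)."""

    def __init__(self):
        self.sums = {}    # label -> per-feature running sums
        self.counts = {}  # label -> number of training instances

    def learn(self, x, y):
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
            self.counts[y] = 0
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def predict_proba(self, x):
        """Return {label: probability} via a softmax over negative squared distances."""
        if not self.counts:
            return {}
        scores = {}
        for y, s in self.sums.items():
            centroid = [v / self.counts[y] for v in s]
            scores[y] = -sum((a - b) ** 2 for a, b in zip(x, centroid))
        top = max(scores.values())
        exp = {y: math.exp(v - top) for y, v in scores.items()}
        total = sum(exp.values())
        return {y: e / total for y, e in exp.items()}


def run_stream(stream, budget=0.1, query_threshold=0.7, self_label_threshold=0.95):
    """Process (x, true_label) pairs; the true label is used for training only when queried."""
    model = CentroidClassifier()
    stats = {"seen": 0, "queried": 0, "self_labeled": 0, "correct": 0}
    for x, true_y in stream:
        stats["seen"] += 1
        proba = model.predict_proba(x)
        pred = max(proba, key=proba.get) if proba else None
        # Confidence is only meaningful once at least two classes are known.
        conf = proba[pred] if len(proba) >= 2 else 0.0
        stats["correct"] += int(pred == true_y)  # test-then-train evaluation

        # Assumed budget rule: fraction of queried instances must stay below `budget`.
        within_budget = stats["queried"] < budget * stats["seen"]
        if conf < query_threshold and within_budget:
            # Active learning: the model is uncertain, so pay for the true label.
            model.learn(x, true_y)
            stats["queried"] += 1
        elif conf >= self_label_threshold:
            # Self-labeling: static threshold, train on the model's own prediction.
            model.learn(x, pred)
            stats["self_labeled"] += 1
    return stats


def make_stream(n, seed=0):
    """Hypothetical synthetic stream: two Gaussian blobs centered at (0, 0) and (4, 4)."""
    rng = random.Random(seed)
    for _ in range(n):
        y = rng.randint(0, 1)
        center = 4.0 * y
        yield [rng.gauss(center, 1.0), rng.gauss(center, 1.0)], y


if __name__ == "__main__":
    print(run_stream(make_stream(1000), budget=0.1))
```

In this sketch the two mechanisms act on disjoint regions of the confidence range: low-confidence instances compete for the labeling budget, while high-confidence instances are added to the training data at no labeling cost. Instances that fall between the two thresholds, or that are uncertain when the budget is exhausted, are only used for evaluation.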

Cite

Korycki, Ł., & Krawczyk, B. (2018). Combining active learning and self-labeling for data stream mining. In Advances in Intelligent Systems and Computing (Vol. 578, pp. 481–490). Springer Verlag. https://doi.org/10.1007/978-3-319-59162-9_50