Combining active learning and self-labeling for data stream mining

Abstract

Data stream mining is among the most vital contemporary data science challenges. In this work we concentrate on the availability of true class labels. The assumption that the ground truth for each instance becomes known right after it is processed is far from realistic, because acquiring labels is usually costly. Active learning is an attractive solution to this problem, as it selects the most valuable instances for labeling. In this paper, we propose augmenting the active learning module with a self-labeling approach. This allows the classifier to automatically label the instances for which it displays the highest certainty and to use them for further training. Although this preliminary work uses a static threshold for self-labeling, the obtained results are encouraging. Our experimental study shows that this approach complements the active learning strategy and improves data stream classification, especially in scenarios with a very small labeling budget.
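
The abstract does not give implementation details, so the following is only a minimal sketch of how a per-instance decision combining a budgeted active learning query with static-threshold self-labeling might look. The function name `decide`, the uncertainty-based query rule, and all parameter values (`budget`, `query_threshold`, `self_label_threshold`) are assumptions made for illustration and are not taken from the paper.

```python
def decide(probs, spent, seen, budget=0.1,
           query_threshold=0.6, self_label_threshold=0.95):
    """Choose what to do with one incoming stream instance.

    probs: dict mapping class label -> posterior estimate from the classifier.
    spent: number of true labels requested so far.
    seen:  number of instances seen so far, including this one.
    All default values are hypothetical and chosen only for illustration.
    """
    predicted = max(probs, key=probs.get)
    confidence = probs[predicted]

    # Active learning: query the oracle when the classifier is uncertain
    # and the fraction of queried labels is still below the budget.
    if confidence < query_threshold and spent < budget * seen:
        return "query", None

    # Self-labeling: accept the classifier's own prediction as the label
    # when its confidence reaches a static threshold.
    if confidence >= self_label_threshold:
        return "self_label", predicted

    # Otherwise the instance is not used for training.
    return "skip", None
```

In a stream loop, a "query" result would trigger a request for the true label and a "self_label" result would pass the predicted label to the classifier's incremental update, so that confident predictions supply training data when the labeling budget is exhausted.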

Cite

Korycki, Ł., & Krawczyk, B. (2018). Combining active learning and self-labeling for data stream mining. In Advances in Intelligent Systems and Computing (Vol. 578, pp. 481–490). Springer Verlag. https://doi.org/10.1007/978-3-319-59162-9_50
