Data stream mining is among the most vital contemporary data science challenges. In this work we concentrate on the actual availability of true class labels. The assumption that the ground truth for each instance becomes known right after it is processed is far from realistic, because acquiring labels is usually costly. Active learning is an attractive solution to this problem, as it selects the most valuable instances for labeling. In this paper, we propose to augment the active learning module with a self-labeling approach. This allows the classifier to automatically label the instances for which it displays the highest certainty and to use them for further training. Although in this preliminary work we use a static threshold for self-labeling, the obtained results are encouraging. Our experimental study shows that this approach complements the active learning strategy and improves data stream classification, especially in scenarios with a very small labeling budget.
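The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier, the uncertainty-based query rule, the budget check, and every name and default value below (`CentroidClassifier`, `process_stream`, `query_threshold`, `self_label_threshold`, `budget`) are assumptions made for illustration. Only the overall idea comes from the abstract: query an oracle for uncertain instances while the labeling budget allows, and self-label instances whose confidence exceeds a static threshold.

```python
import math


class CentroidClassifier:
    """Tiny incremental nearest-centroid classifier (a stand-in model, assumed here)."""

    def __init__(self):
        self.sums = {}    # label -> per-feature running sum
        self.counts = {}  # label -> number of training instances

    def learn(self, x, y):
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
            self.counts[y] = 0
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def predict_proba(self, x):
        # Softmax over negative squared distances to the class centroids.
        scores = {}
        for y, total in self.sums.items():
            centroid = [t / self.counts[y] for t in total]
            scores[y] = -sum((a - b) ** 2 for a, b in zip(x, centroid))
        if not scores:
            return {}
        top = max(scores.values())
        exps = {y: math.exp(s - top) for y, s in scores.items()}
        norm = sum(exps.values())
        return {y: e / norm for y, e in exps.items()}


def process_stream(stream, oracle, budget=0.1,
                   query_threshold=0.6, self_label_threshold=0.95):
    """Process instances one by one, mixing oracle queries and self-labeling.

    All thresholds are hypothetical defaults, not values from the paper.
    """
    model = CentroidClassifier()
    seen = queried = self_labeled = 0
    for x in stream:
        seen += 1
        proba = model.predict_proba(x)
        if len(proba) < 2:
            # With fewer than two known classes, confidence is meaningless.
            label, confidence = None, 0.0
        else:
            label, confidence = max(proba.items(), key=lambda kv: kv[1])

        if confidence < query_threshold and queried < budget * seen:
            # Active learning: spend budget on an uncertain instance.
            model.learn(x, oracle(x))
            queried += 1
        elif confidence >= self_label_threshold:
            # Self-labeling: trust the prediction above the static threshold.
            model.learn(x, label)
            self_labeled += 1
        # Otherwise the instance is discarded without training.
    stats = {"seen": seen, "queried": queried, "self_labeled": self_labeled}
    return model, stats
```

Calling `process_stream(stream, oracle, budget=0.05)` with an iterable of feature tuples and a labeling function returns the trained model and counts of queried and self-labeled instances. Requiring two known classes before self-labeling is a design choice of this sketch: without it, a model that has seen a single class would be fully confident in every prediction and would reinforce that class indefinitely.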
CITATION STYLE
Korycki, Ł., & Krawczyk, B. (2018). Combining active learning and self-labeling for data stream mining. In Advances in Intelligent Systems and Computing (Vol. 578, pp. 481–490). Springer Verlag. https://doi.org/10.1007/978-3-319-59162-9_50