Combining instance selection and self-training to improve data stream quantification

André G. Maletzke; Denis M. dos Reis; Gustavo E.A.P.A. Batista

Journal Article, Open Access

Journal of the Brazilian Computer Society (2018) 24(1)

DOI: 10.1186/s13173-018-0076-0

Abstract

In recent years, learning from data streams has attracted the attention of researchers and practitioners because of its many applications. These applications have motivated the research community to propose a large number of methods for diverse tasks, most prominently classification, clustering, and anomaly detection. However, a relevant task known as quantification has remained mostly unexplored. The goal of quantification is to estimate the class prevalence in an unlabeled set. Recently, we proposed the SQSI algorithm to quantify data streams with concept drifts. SQSI uses a statistical test to identify concept drifts and retrain the classifiers. However, retraining requires the labels of all newly arrived instances. In this paper, we extend the SQSI algorithm by exploring instance selection techniques combined with semi-supervised learning. The idea is to request the classes of only a smaller subset of recent examples. Our experiments demonstrate that the SQSI extension significantly reduces the dependency on actual labels while maintaining or improving quantification accuracy.
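
To make the quantification task concrete, the following minimal Python sketch estimates class prevalence in an unlabeled window with two generic baselines: classify and count, and its adjusted variant that corrects for classifier error rates. This is not the SQSI algorithm; the abstract does not state which quantifier SQSI uses. The function names, the example window, and the `tpr` and `fpr` values are hypothetical and chosen only for illustration.

```python
def classify_and_count(predictions, positive=1):
    """Estimate positive-class prevalence as the fraction of
    instances that a classifier predicted as positive."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    return sum(1 for p in predictions if p == positive) / len(predictions)


def adjusted_count(cc_estimate, tpr, fpr):
    """Correct a classify-and-count estimate using the classifier's
    true positive rate and false positive rate (assumed to have been
    measured on labeled data beforehand)."""
    if tpr == fpr:
        # The correction is undefined; fall back to the raw estimate.
        return cc_estimate
    adjusted = (cc_estimate - fpr) / (tpr - fpr)
    # Clip to the valid prevalence range [0, 1].
    return min(1.0, max(0.0, adjusted))


# Hypothetical classifier outputs for a window of 10 unlabeled instances.
window = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
cc = classify_and_count(window)
acc = adjusted_count(cc, tpr=0.8, fpr=0.2)
print(f"classify and count: {cc:.3f}, adjusted count: {acc:.3f}")
```

Note that the output is a single prevalence estimate for the whole window, not a label per instance. That is what distinguishes quantification from classification.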

Cite

Maletzke, A. G., dos Reis, D. M., & Batista, G. E. A. P. A. (2018). Combining instance selection and self-training to improve data stream quantification. Journal of the Brazilian Computer Society, 24(1). https://doi.org/10.1186/s13173-018-0076-0
