A Study on Imbalanced Data Streams

Ehsan Aminian; Rita P. Ribeiro; João Gama

Conference Proceedings (Open Access)

Communications in Computer and Information Science (2020), vol. 1168 CCIS, pp. 380–389

DOI: 10.1007/978-3-030-43887-6_31

Abstract

Data are growing fast in today’s world, and a large portion of it arrives in the form of streams. In many situations, data streams are imbalanced, which makes them difficult to handle with classical data mining methods. Nevertheless, mining this kind of stream is one of the most attractive research areas. In this paper, we propose two algorithms for learning from imbalanced regression data streams. Both methods are based on Chebyshev’s inequality, but they use it in different ways. The first method under-samples examples with frequent target values, while the second over-samples examples with rare and extreme target values. This way, the learner focuses on the rare and more difficult cases. We applied our methods to train regression models on two benchmark datasets with two well-known regression algorithms: Perceptron and FIMT-DD. The simulation results indicate the usefulness of the proposed methods.
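The abstract does not spell out the sampling rules, so the following is only a minimal sketch of how Chebyshev's inequality could drive the two strategies; it is not the authors' exact formulation. The inequality bounds P(|Y - mean| >= t * std) by 1/t^2, so a target value lying t standard deviations from the running mean can be treated as increasingly rare as t grows. The names `RunningStats`, `keep_probability`, `replication_count`, `under_sample` and `over_sample`, and the specific rules (keep with probability 1 - 1/t^2; replicate ceil(t) times), are assumptions made for illustration.

```python
import math
import random


class RunningStats:
    """Welford's online mean and sample standard deviation of the target."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0

    def update(self, y):
        self.n += 1
        delta = y - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (y - self.mean)

    @property
    def std(self):
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def deviation(y, stats):
    """Distance t of y from the running mean, in standard-deviation units."""
    return abs(y - stats.mean) / stats.std if stats.std > 0 else 0.0


def keep_probability(t):
    """Assumed under-sampling rule: Chebyshev gives P(|Y - mean| >= t*std) <= 1/t^2,
    so 1 - 1/t^2 approaches 1 as the target value becomes rarer."""
    return 1.0 - 1.0 / (t * t) if t > 1.0 else 0.0


def replication_count(t):
    """Assumed over-sampling rule: present an example ceil(t) times, at least once."""
    return max(1, math.ceil(t))


def under_sample(stream, rng):
    """Yield (x, y) pairs, dropping most examples whose target is near the mean."""
    stats = RunningStats()
    for x, y in stream:
        stats.update(y)
        if rng.random() < keep_probability(deviation(y, stats)):
            yield x, y


def over_sample(stream):
    """Yield (x, y) pairs, repeating examples with rare or extreme targets."""
    stats = RunningStats()
    for x, y in stream:
        stats.update(y)
        for _ in range(replication_count(deviation(y, stats))):
            yield x, y
```

Either generator can sit between the raw stream and an incremental regressor such as a Perceptron or FIMT-DD, for example `for x, y in over_sample(stream): model.learn(x, y)`, where `model.learn` is a hypothetical training call.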

Citation (APA)

Aminian, E., Ribeiro, R. P., & Gama, J. (2020). A Study on Imbalanced Data Streams. In Communications in Computer and Information Science (Vol. 1168 CCIS, pp. 380–389). Springer. https://doi.org/10.1007/978-3-030-43887-6_31
