Empirical similarity for absent data generation in imbalanced classification

Arash Pourhabib

Book Chapter

Lecture Notes in Networks and Systems, Vol. 69, Springer (2020), pp. 1010–1030

DOI: 10.1007/978-3-030-12388-8_70

Abstract

When the training data in a two-class classification problem are dominated by one class, most classification techniques fail to correctly identify the data points belonging to the underrepresented class. This paper proposes Similarity-based Imbalanced Classification (SBIC), a method that simultaneously optimizes the weights of the empirical similarity function and identifies the locations of absent data points, i.e., unobserved data points from the minority class. Like cost-sensitive approaches, SBIC operates at the algorithmic level to handle imbalanced structures; like synthetic data generation approaches, it exploits the properties of unobserved data points. The main contribution of the paper is to show that a similarity function can be used to tackle imbalanced datasets. Results from applying the proposed method to imbalanced datasets suggest that SBIC is comparable to, and in some cases outperforms, other commonly used classification techniques for imbalanced datasets.
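To make the idea of similarity-based prediction with added minority points concrete, here is a minimal sketch. It is not the SBIC algorithm from the chapter. It assumes an exponential weighted-distance similarity and a similarity-weighted average of labels, keeps the weights fixed, and places the "absent" points by hand, whereas the abstract describes optimizing the weights and the point locations jointly. All function names, data values, and the form of the similarity function are illustrative assumptions.

```python
import math


def similarity(x, z, w):
    """Assumed similarity: exp of the negative weighted squared distance."""
    return math.exp(-sum(wj * (xj - zj) ** 2 for wj, xj, zj in zip(w, x, z)))


def predict_score(x, points, labels, w):
    """Similarity-weighted average of labels (1 = minority, 0 = majority)."""
    sims = [similarity(x, p, w) for p in points]
    return sum(s * y for s, y in zip(sims, labels)) / sum(sims)


# Toy imbalanced data: four majority points, one observed minority point.
majority = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2)]
minority = [(1.0, 1.0)]
weights = (1.0, 1.0)  # fixed for illustration; SBIC is described as optimizing these

points = majority + minority
labels = [0] * len(majority) + [1] * len(minority)

query = (0.6, 0.6)
score_before = predict_score(query, points, labels, weights)

# Hypothetical absent minority points, placed by hand near the minority region.
absent = [(0.8, 0.9), (0.9, 0.7)]
points_aug = points + absent
labels_aug = labels + [1] * len(absent)
score_after = predict_score(query, points_aug, labels_aug, weights)

print(score_before, score_after)
```

In this toy setting, adding minority-labeled points raises the minority score of a query near the minority region, which is the intuition behind using unobserved minority points to counteract the imbalance.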

Citation (APA)

Pourhabib, A. (2020). Empirical similarity for absent data generation in imbalanced classification. In Lecture Notes in Networks and Systems (Vol. 69, pp. 1010–1030). Springer. https://doi.org/10.1007/978-3-030-12388-8_70
