When the training data in a two-class classification problem is dominated by one class, most classification techniques fail to correctly identify the data points belonging to the underrepresented class. This paper proposes Similarity-based Imbalanced Classification (SBIC), a method that simultaneously optimizes the weights of the empirical similarity function and identifies the locations of absent data points, i.e., unobserved data points from the minority class. Like cost-sensitive approaches, SBIC operates at the algorithmic level to handle imbalanced structures; like synthetic data generation approaches, it exploits the properties of unobserved data points. The main contribution of the paper is to show that a similarity function can be used to tackle imbalanced datasets. The results of applying the proposed method to imbalanced datasets suggest that SBIC is comparable to, and in some cases outperforms, other commonly used classification techniques for imbalanced datasets.
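To make the idea of a similarity-weighted classifier concrete, the following is a minimal sketch and not the SBIC algorithm itself. The abstract gives neither the functional form of the similarity function nor the optimization procedure, so the sketch assumes an exponential weighted squared-distance similarity, fixes the weights `w` by hand, and places the absent minority points by hand rather than optimizing them. All names (`similarity`, `minority_score`, `absent`) are hypothetical.

```python
import math

def similarity(x, z, w):
    """Assumed similarity form: exp(-sum_j w_j * (x_j - z_j)^2)."""
    return math.exp(-sum(wj * (xj - zj) ** 2 for wj, xj, zj in zip(w, x, z)))

def minority_score(x, points, labels, w):
    """Similarity-weighted average of labels (1 = minority, 0 = majority)."""
    sims = [similarity(x, p, w) for p in points]
    return sum(s * y for s, y in zip(sims, labels)) / sum(sims)

# Toy imbalanced data: four majority points, one observed minority point.
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
minority = [(1.0, 1.0)]

# Hypothetical absent minority points, placed by hand for illustration only;
# the paper identifies their locations jointly with the weights.
absent = [(0.9, 0.8), (0.8, 1.0)]

w = (2.0, 2.0)      # hypothetical fixed weights, not optimized
query = (0.6, 0.6)  # a point lying between the two classes

points = majority + minority
labels = [0] * len(majority) + [1] * len(minority)

score_observed = minority_score(query, points, labels, w)
score_augmented = minority_score(
    query, points + absent, labels + [1] * len(absent), w
)

print(f"minority score, observed data only: {score_observed:.3f}")
print(f"minority score, with absent points: {score_augmented:.3f}")
```

In this toy setting, adding minority-labeled points near the query raises its minority score, which illustrates why locating absent points can counteract the pull of the majority class.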
Pourhabib, A. (2020). Empirical similarity for absent data generation in imbalanced classification. In Lecture Notes in Networks and Systems (Vol. 69, pp. 1010–1030). Springer. https://doi.org/10.1007/978-3-030-12388-8_70