Utilizing DTRS for imbalanced text classification

Bing Zhou; Yiyu Yao; Qingzhong Liu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016) 9920 LNAI 219-228

DOI: 10.1007/978-3-319-47160-0_20

Abstract

Imbalanced data classification is one of the challenging problems in data mining and machine learning research. Traditional classification algorithms are often biased towards the majority class when learning from imbalanced data. Many approaches have been proposed to address this problem, including data re-sampling, algorithm modification, and cost-sensitive learning. However, most of them focus on only one of these techniques. This paper proposes to utilize both algorithm modification and cost-sensitive learning based on the decision-theoretic rough set (DTRS) model. In particular, we use the naive Bayes classifier as the base classifier and modify it for imbalanced learning. For cost-sensitive learning, we adopt the systematic method from DTRS to derive the required thresholds that have the minimum decision cost. Our experimental results on three well-known text classification databases show that unified DTRS provides similar performance on balanced class distributions, outperforms the naive Bayes classifier on imbalanced datasets, and is competitive with other imbalanced learning classifiers.
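
The threshold derivation mentioned in the abstract can be illustrated with the standard DTRS formulas, in which a pair of thresholds (alpha, beta) is computed from six loss values so that each decision minimizes expected cost. The sketch below is not taken from the paper: the function names, the loss values, and the three-way accept/defer/reject rule are illustrative assumptions, and the paper's modified naive Bayes classifier may use the thresholds differently.

```python
def dtrs_thresholds(l_pp, l_bp, l_np, l_pn, l_bn, l_nn):
    """Compute DTRS thresholds (alpha, beta) from six losses.

    l_xy is the loss of taking action x when the true state is y, where
    x is p (accept), b (defer) or n (reject), and y is p (object is in
    the class) or n (object is not in the class).
    Assumes l_pp <= l_bp < l_np and l_nn <= l_bn < l_pn.
    """
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta


def three_way_decision(posterior, alpha, beta):
    """Map a class posterior probability to accept, reject or defer."""
    if posterior >= alpha:
        return "accept"
    if posterior <= beta:
        return "reject"
    return "defer"


# Hypothetical losses (not from the paper): wrongly rejecting a member
# of the class (l_np = 10) costs more than wrongly accepting a
# non-member (l_pn = 6), as one might choose for a minority class.
alpha, beta = dtrs_thresholds(l_pp=0.0, l_bp=2.0, l_np=10.0,
                              l_pn=6.0, l_bn=1.0, l_nn=0.0)
decisions = [three_way_decision(p, alpha, beta) for p in (0.8, 0.4, 0.05)]
```

With these made-up losses, alpha is 5/7 and beta is 1/9, so a posterior such as one produced by a naive Bayes classifier is accepted at or above roughly 0.71, rejected at or below roughly 0.11, and deferred in between. Raising the loss of a missed minority-class document lowers beta, so fewer such documents are rejected outright.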

Cite

Zhou, B., Yao, Y., & Liu, Q. (2016). Utilizing DTRS for imbalanced text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9920 LNAI, pp. 219–228). Springer Verlag. https://doi.org/10.1007/978-3-319-47160-0_20
