We examine supervised learning for multi-class, multi-label text classification. We are interested in exploring classification in a real-world setting, where the distribution of labels may change dynamically over time. First, we compare the performance of an array of binary classifiers trained on the label distribution found in the original corpus against classifiers trained on balanced data, where we make the label distribution as nearly uniform as possible. We discuss the performance trade-offs between balanced and unbalanced training, and highlight the advantages of balancing the training set. Second, we compare the performance of two classifiers, Naive Bayes and SVM, with several feature-selection methods, using balanced training. We combine a named-entity-based rote classifier with the statistical classifiers to obtain better performance than either method alone.
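The abstract does not specify the balancing procedure, but a common way to make the class split for each one-vs-rest binary classifier nearly uniform is to undersample the majority class. The following Python sketch illustrates that idea under stated assumptions; the function name, data layout, and undersampling strategy are illustrative and not necessarily the authors' exact method.

```python
import random

def balance_binary_training(docs, labels, target_label, seed=0):
    """Build a balanced training set for one one-vs-rest binary classifier.

    docs:   list of documents (any representation)
    labels: parallel list of label sets (multi-label data)
    Returns a shuffled list of (doc, is_positive) pairs with an equal
    number of positives and negatives, obtained by undersampling the
    majority class.
    """
    rng = random.Random(seed)
    pos = [d for d, ls in zip(docs, labels) if target_label in ls]
    neg = [d for d, ls in zip(docs, labels) if target_label not in ls]
    # Undersample the larger class so both classes have the same size.
    n = min(len(pos), len(neg))
    sample = ([(d, True) for d in rng.sample(pos, n)]
              + [(d, False) for d in rng.sample(neg, n)])
    rng.shuffle(sample)
    return sample

# Toy usage: "sports" appears in 3 of 4 documents, so the balanced set
# keeps 1 positive and 1 negative example.
docs = ["doc1", "doc2", "doc3", "doc4"]
labels = [{"sports"}, {"sports", "politics"}, {"sports"}, {"politics"}]
print(balance_binary_training(docs, labels, "sports"))
```

Repeating this per label yields one balanced training set per binary classifier; the trade-off, as the abstract notes, is between using all available (skewed) data and training on a smaller but uniform sample.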
Du, M., Pierce, M., Pivovarova, L., & Yangarber, R. (2014). Supervised classification using balanced training. Lecture Notes in Computer Science, 8791, 147–158. https://doi.org/10.1007/978-3-319-11397-5_11