Supervised classification using balanced training


Abstract

We examine supervised learning for multi-class, multi-label text classification. We are interested in exploring classification in a real-world setting, where the distribution of labels may change dynamically over time. First, we compare the performance of an array of binary classifiers trained on the label distribution found in the original corpus against classifiers trained on balanced data, where we try to make the label distribution as nearly uniform as possible. We discuss the performance trade-offs between balanced vs. unbalanced training, and highlight the advantages of balancing the training set. Second, we compare the performance of two classifiers, Naive Bayes and SVM, with several feature-selection methods, using balanced training. We combine a Named-Entity-based rote classifier with the statistical classifiers to obtain better performance than either method alone.
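One common way to balance a training set toward a uniform label distribution, as the abstract describes, is to oversample the minority classes until each label matches the majority count. The sketch below illustrates this idea in plain Python; it is an assumption about the general technique, not the authors' exact procedure, and the function name `balance_by_oversampling` is hypothetical.

```python
import random
from collections import defaultdict

def balance_by_oversampling(examples, seed=0):
    """Resample (text, label) pairs so every label appears as often
    as the most frequent one. Minority labels are oversampled with
    replacement; majority examples are kept intact."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        # Draw extra copies (with replacement) to reach the target count.
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced

# Skewed toy corpus: 8 "sports" documents vs. 2 "politics" documents.
corpus = [("doc", "sports")] * 8 + [("doc", "politics")] * 2
balanced = balance_by_oversampling(corpus)
```

Undersampling the majority classes is the usual alternative; it avoids duplicate examples at the cost of discarding training data.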

Citation (APA)

Du, M., Pierce, M., Pivovarova, L., & Yangarber, R. (2014). Supervised classification using balanced training. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8791, 147–158. https://doi.org/10.1007/978-3-319-11397-5_11
