An important use of natural language processing is to create vector representations of documents that serve as features in a classification task. The traditional bag-of-words approach uses one-hot vector representations of words, which aggregate into a sparse vector representation of each document. This representation can be enhanced by giving more weight to the words that contribute the most to the classification task. In this paper, we propose a generalization of the Bi-Normal Separation metric that enhances vector representations of documents and outperforms TF-IDF scaling algorithms for one-of-m classification tasks.
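For readers unfamiliar with the metric, the following is a minimal sketch of the original two-class Bi-Normal Separation weight, not the multi-class generalization proposed in the paper, which the abstract does not spell out. It assumes the usual definition, the absolute difference between the inverse standard normal CDF of a word's true positive rate and that of its false positive rate, with both rates clipped away from 0 and 1. The function names `bns_weight` and `bns_weights`, the clipping value `eps`, and the toy data are illustrative assumptions.

```python
from collections import Counter
from statistics import NormalDist


def bns_weight(tp, fp, pos, neg, eps=0.0005):
    """Two-class BNS weight of one word (illustrative sketch).

    tp / fp: number of positive / negative documents containing the word.
    pos / neg: total number of positive / negative documents.
    eps: assumed clipping bound that keeps the inverse CDF finite.
    """
    inv = NormalDist().inv_cdf  # inverse CDF of the standard normal
    tpr = min(max(tp / pos, eps), 1 - eps)
    fpr = min(max(fp / neg, eps), 1 - eps)
    return abs(inv(tpr) - inv(fpr))


def bns_weights(docs, labels, positive):
    """BNS weight of every word, treating `positive` as the target class.

    docs: list of token lists; labels: one class label per document.
    """
    pos = sum(1 for y in labels if y == positive)
    neg = len(labels) - pos
    tp, fp = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        # Count document frequency, so each word counts once per document.
        (tp if y == positive else fp).update(set(tokens))
    vocab = set(tp) | set(fp)
    return {w: bns_weight(tp[w], fp[w], pos, neg) for w in vocab}


# Hypothetical toy corpus, for illustration only.
docs = [["good", "movie"], ["good", "plot"], ["bad", "movie"], ["bad", "plot"]]
labels = ["pos", "pos", "neg", "neg"]
weights = bns_weights(docs, labels, positive="pos")
```

In this toy corpus, "good" and "bad" each appear in only one class and receive large weights, while "movie" and "plot" appear equally often in both classes and receive a weight of zero. A bag-of-words document vector would then be scaled by multiplying each word's count by its weight.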
Baillargeon, J. T., Lamontagne, L., & Marceau, É. (2019). Weighting Words Using Bi-Normal Separation for Text Classification Tasks with Multiple Classes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11489 LNAI, pp. 433–439). Springer Verlag. https://doi.org/10.1007/978-3-030-18305-9_41