Low-Dimensional Classification of Text Documents

Tomasz Walkowiak; Szymon Datko; Henryk Maciejewski

Conference Proceedings

Advances in Intelligent Systems and Computing (2020) 987 534-543

DOI: 10.1007/978-3-030-19501-4_53

Abstract

In this paper we challenge the common belief that accurate subject classification of text documents requires high-dimensional feature vectors. We study the fastText algorithm in terms of its ability to find and extract clearly distinguishable characteristics of a text corpus. We compare the accuracy achieved in the subject classification task across feature spaces of various sizes. Finally, we attempt to uncover the reasons behind fastText’s good performance.

Author supplied keywords

Feature extraction
Subject classification
Text mining
Word embedding
fastText

Cite

Walkowiak, T., Datko, S., & Maciejewski, H. (2020). Low-Dimensional Classification of Text Documents. In Advances in Intelligent Systems and Computing (Vol. 987, pp. 534–543). Springer Verlag. https://doi.org/10.1007/978-3-030-19501-4_53
