In this paper we challenge the common belief that accurate subject classification of text documents requires high-dimensional feature vectors. We study the fastText algorithm in terms of its ability to find and extract clearly distinguishable characteristics of a text corpus. We compare the accuracy achieved in subject classification across feature spaces of various sizes. Finally, we attempt to uncover the basis of fastText’s good performance.
CITATION STYLE
Walkowiak, T., Datko, S., & Maciejewski, H. (2020). Low-Dimensional Classification of Text Documents. In Advances in Intelligent Systems and Computing (Vol. 987, pp. 534–543). Springer Verlag. https://doi.org/10.1007/978-3-030-19501-4_53