Application of a new set of pseudo-distances in documents categorization

S. Gadri; A. Moussaoui

Journal article, open access


Neural Network World (2017) 27(2) 231-245

DOI: 10.14311/NNW.2017.27.011


Abstract

Automatic text classification is an important task that consists of assigning labels (categories, groups, classes) to a given text based on a set of previously labeled texts called the training set. The work presented in this paper addresses the problem of automatic topical text categorization. The task is supervised because it works with a predefined set of classes, and topical because it uses the topics or subjects of texts as classes. In this context, we use a new approach based on the k-NN algorithm together with a new set of pseudo-distances (distance metrics) known from the field of language identification. We also propose a simple and effective method to improve the quality of the resulting categorization.
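The abstract does not say which pseudo-distances the paper uses, so the following is only an illustrative sketch of the general idea: a k-NN text categorizer with a pluggable pseudo-distance. As a stand-in it assumes the rank-based "out-of-place" measure over character n-gram profiles, which is well known from language identification; it may or may not be among the paper's measures. The function names, the parameters `n`, `top`, and `k`, and the toy training set are all hypothetical.

```python
from collections import Counter


def ngram_profile(text, n=3, top=300):
    """Map the `top` most frequent character n-grams to their rank (0 = most frequent)."""
    # Normalize case and whitespace, and pad so word boundaries form n-grams too.
    padded = " " + " ".join(text.lower().split()) + " "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    # Sort by descending count, then alphabetically, so ranks are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]
    return {gram: rank for rank, (gram, _) in enumerate(ranked)}


def out_of_place(profile_a, profile_b):
    """Assumed stand-in pseudo-distance: summed rank differences of profile_a's n-grams.

    N-grams missing from profile_b incur a fixed penalty. The measure is not
    symmetric, so it is a pseudo-distance rather than a true metric.
    """
    penalty = max(len(profile_a), len(profile_b))
    return sum(
        abs(rank - profile_b[gram]) if gram in profile_b else penalty
        for gram, rank in profile_a.items()
    )


def knn_classify(text, training_set, k=3, distance=out_of_place):
    """Return the majority label among the k training texts closest to `text`."""
    query = ngram_profile(text)
    neighbours = sorted(
        training_set,
        key=lambda item: distance(query, ngram_profile(item[0])),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy training set of (text, topic) pairs, for illustration only.
training = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stock markets fell on inflation news", "finance"),
]

label = knn_classify("the bank raised interest rates", training, k=1)
```

Passing the distance as a parameter keeps the k-NN step independent of the measure, so any other pseudo-distance over profiles could be swapped in without changing the classifier.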


Citation (APA)

Gadri, S., & Moussaoui, A. (2017). Application of a new set of pseudo-distances in documents categorization. Neural Network World, 27(2), 231–245. https://doi.org/10.14311/NNW.2017.27.011
