Application of a new set of pseudo-distances in documents categorization

Abstract

Automatic text classification is an important task that consists of assigning labels (categories, groups, classes) to a given text based on a set of previously labeled texts called the training set. The work presented in this paper addresses the problem of automatic topical text categorization. It is supervised classification because it works on a predefined set of classes, and topical because it uses the topics or subjects of texts as classes. In this context, we used a new approach based on the k-NN algorithm, together with a new set of pseudo-distances (distance metrics) known from the field of language identification. We also proposed a simple and effective method to improve the quality of the resulting categorization.
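The abstract does not specify which pseudo-distances are used, so the following is only a minimal sketch of the general idea: k-NN categorization with a pluggable distance function. As a stand-in, it assumes the "out-of-place" rank measure over character n-gram profiles, a measure commonly used in language identification. This measure is not necessarily one of the paper's distances, and the function names `ngram_profile`, `out_of_place`, and `knn_classify`, as well as the parameters `n`, `top`, and `k`, are hypothetical.

```python
from collections import Counter


def ngram_profile(text, n=3, top=300):
    """Map each of the `top` most frequent character n-grams to its rank."""
    # Normalize case and whitespace, then pad so word boundaries form n-grams.
    padded = f" {' '.join(text.lower().split())} "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    # Sort by descending count, then alphabetically, so ranks are deterministic.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))[:top]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place(profile_a, profile_b):
    """Assumed stand-in pseudo-distance: sum of rank differences.

    N-grams of profile_a that are missing from profile_b get a fixed
    maximum penalty. The measure is not symmetric, hence "pseudo".
    """
    penalty = max(len(profile_a), len(profile_b))
    return sum(
        abs(rank - profile_b[gram]) if gram in profile_b else penalty
        for gram, rank in profile_a.items()
    )


def knn_classify(text, training_set, k=3, distance=out_of_place):
    """Label `text` by majority vote among its k nearest training texts.

    `training_set` is a list of (document_text, label) pairs.
    """
    query = ngram_profile(text)
    scored = sorted(
        (distance(query, ngram_profile(doc)), label)
        for doc, label in training_set
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

Because `distance` is an argument, any other pseudo-distance can be substituted without touching the voting logic. A real system would also precompute the training profiles once rather than rebuilding them for every query.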

Cite

Gadri, S., & Moussaoui, A. (2017). Application of a new set of pseudo-distances in documents categorization. Neural Network World, 27(2), 231–245. https://doi.org/10.14311/NNW.2017.27.011
