Multi-word expressions annotations effect in document classification task

Dhekra Najar; Slim Mesfar; Henda Ben Ghezela

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10859 LNCS 238-246

DOI: 10.1007/978-3-319-91947-8_23

Citations: 1

Readers: 2

Abstract

Document classification is a necessary task for most Natural Language Processing tools, since it organizes document content in a helpful and meaningful way. The main concern of this paper is to investigate how using multi-word expressions for text representation affects the performance of the text classification task. We propose two text classification strategies and examine the robustness of each. First, we review the existing linguistic resources for the Arabic language. Second, we present a classification method based on candidate simple terms for each domain. These terms are automatically extracted from multiple specialized corpora according to their frequency of occurrence. Then, we give a detailed description of a classification method based on a dictionary of multi-word expressions. CompounDic, an Arabic multi-word expressions dictionary, is used to automatically annotate multi-word expressions and their variations in text. Finally, we carry out a series of experiments on classifying specialized text, based on simple words and on multi-word expressions, for comparison purposes. Our experiments show that the use of multi-word expression annotations enhances the text classification results.
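
The abstract does not specify the extraction threshold, the format of CompounDic, or the classifier, so the following is only a minimal sketch of the two strategies under stated assumptions: whitespace tokenization, a plain set of strings standing in for the dictionary, greedy longest-match annotation, and a simple term-overlap score. All function names and the English toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def top_terms(docs, k):
    """Return the k most frequent whitespace tokens in a domain corpus."""
    counts = Counter(tok for doc in docs for tok in doc.split())
    return {term for term, _ in counts.most_common(k)}


def annotate_mwes(text, mwe_dict):
    """Greedy longest-match annotation: each dictionary expression found in
    the text is merged into a single token joined by underscores."""
    tokens = text.split()
    max_len = max((len(m.split()) for m in mwe_dict), default=1)
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate first, down to two-word expressions.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            cand = " ".join(tokens[i:i + n])
            if cand in mwe_dict:
                out.append(cand.replace(" ", "_"))
                i += n
                break
        else:
            # No multi-word match starts here: keep the simple word.
            out.append(tokens[i])
            i += 1
    return out


def classify(tokens, domain_terms):
    """Pick the domain whose term set matches the most document tokens."""
    scores = {d: sum(t in terms for t in tokens)
              for d, terms in domain_terms.items()}
    return max(scores, key=scores.get)


# Toy stand-in for a multi-word expressions dictionary (assumption).
mwe_dict = {"stock market", "heart attack"}
domain_terms = {
    "finance": {"stock_market", "bank"},
    "health": {"heart_attack", "doctor"},
}
tokens = annotate_mwes("the stock market fell", mwe_dict)
label = classify(tokens, domain_terms)
```

In this sketch, the simple-term strategy would build each entry of `domain_terms` with `top_terms` over a specialized corpus, while the multi-word strategy first runs `annotate_mwes` so that each expression counts as one feature.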

Citation (APA)

Najar, D., Mesfar, S., & Ghezela, H. B. (2018). Multi-word expressions annotations effect in document classification task. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10859 LNCS, pp. 238–246). Springer Verlag. https://doi.org/10.1007/978-3-319-91947-8_23
