Features based approach for indexation and representation of unstructured Arabic documents

Mohamed Salim El Bazzi; Driss Mammass; Abdelatif Ennaji; Taher Zaki

Journal Article, Open Access

Advances in Science, Technology and Engineering Systems (2017) 2(3) 900-905

DOI: 10.25046/aj0203112

Abstract

The growing volume of Arabic-language text published on the internet and held by public libraries and administrations calls for effective techniques for extracting relevant information from large text corpora. The purpose of indexing is to create a document representation that makes it easy to find and identify the relevant information in a set of documents. However, mining textual data is becoming a complicated task, especially when semantics are taken into account. In this paper, we present an indexing system based on a contextual representation that takes advantage of the semantic links within a document. Our approach is based on the extraction of keyphrases: each document is represented by its relevant keyphrases rather than by simple keywords. The experimental results confirm the effectiveness of our approach.
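To make the idea of representing documents by keyphrases rather than single keywords more concrete, here is a minimal Python sketch. It is not the authors' system: the abstract does not specify how keyphrases are extracted or scored, so the stopword list, the n-gram candidate generation, the frequency-based ranking, and the function names `candidate_phrases`, `top_keyphrases`, and `build_index` are all illustrative assumptions.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny Arabic stopword list (illustration only).
STOPWORDS = {"في", "من", "على", "إلى", "عن", "و", "أن", "هذا", "هذه"}


def candidate_phrases(text, max_len=3):
    """Return multi-word candidates (2..max_len words) that contain no stopword.

    Assumption: candidates are contiguous n-grams inside runs of
    non-stopword tokens. The paper may use a different extraction method.
    """
    tokens = re.findall(r"\w+", text)
    runs, current = [], []
    for tok in tokens:
        if tok in STOPWORDS:
            # A stopword closes the current run of content words.
            if current:
                runs.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        runs.append(current)

    phrases = []
    for run in runs:
        for n in range(2, max_len + 1):
            for i in range(len(run) - n + 1):
                phrases.append(" ".join(run[i:i + n]))
    return phrases


def top_keyphrases(text, k=5):
    """Rank candidates by raw frequency (an assumed, simplistic score)."""
    counts = Counter(candidate_phrases(text))
    return [phrase for phrase, _ in counts.most_common(k)]


def build_index(docs, k=5):
    """Map each selected keyphrase to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for phrase in top_keyphrases(text, k):
            index[phrase].add(doc_id)
    return index
```

Calling `build_index` on a dictionary of `{doc_id: text}` pairs yields an inverted index keyed by phrases, so a query for a multi-word expression matches only documents in which those words occur together, which is the intuition behind preferring keyphrases to isolated keywords.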

El Bazzi, M. S., Mammass, D., Ennaji, A., & Zaki, T. (2017). Features based approach for indexation and representation of unstructured Arabic documents. Advances in Science, Technology and Engineering Systems, 2(3), 900–905. https://doi.org/10.25046/aj0203112
