Multi-document Arabic text summarization based on thematic annotation

Abstract

Text summarization is the task of reducing one or more documents to their key, most significant sentences. It is a long-standing problem in natural language processing research, and results have improved over the years thanks to the considerable number of methods proposed in this area. This paper proposes an approach to Arabic multi-document text summarization. The originality of the approach is that the summary is based on thematic annotation: the input documents are analyzed and segmented by topic using LDA (Latent Dirichlet Allocation). The segments of each topic are then represented by a separate graph, to address the redundancy problem in multi-document summarization. In the last step, the approach applies a modified PageRank algorithm that uses the cosine similarity measure as the edge weight between vertices. Vertices with high scores are considered essential, so they make up the final summary. To evaluate summarization systems, researchers have developed several metrics, divided into three categories: automatic, semi-automatic, and manual. This study uses automatic evaluation methods for text summarization, mainly the ROUGE measures (ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-SU4).
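
The graph-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how PageRank was modified, so the code below assumes a standard weighted PageRank in which each vertex is a sentence and each edge weight is the cosine similarity of bag-of-words vectors. The function names (cosine_similarity, weighted_pagerank, rank_sentences), the damping factor of 0.85, and the whitespace tokenizer are all assumptions made for illustration. The LDA segmentation step is omitted, so the input is assumed to be the sentences of a single topic. Real Arabic text would also need normalization and stemming before vectorization.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def weighted_pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank over a square matrix of edge weights.

    weights[j][i] is the weight of the edge from vertex j to vertex i.
    The damping factor and iteration count are assumed defaults.
    """
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if j != i and out_sum[j] > 0:
                    # Vertex j passes on its score in proportion to edge weight.
                    rank += weights[j][i] / out_sum[j] * scores[j]
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def rank_sentences(sentences, top_k=1):
    """Return the top_k highest-scoring sentences in their original order."""
    # Whitespace tokenization is a placeholder for real Arabic preprocessing.
    vectors = [Counter(s.split()) for s in sentences]
    n = len(vectors)
    weights = [
        [cosine_similarity(vectors[i], vectors[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    scores = weighted_pagerank(weights)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_k])]


if __name__ == "__main__":
    topic_sentences = [
        "the cat sat on the mat",
        "the cat lay on the mat",
        "stocks fell sharply today",
    ]
    print(rank_sentences(topic_sentences, top_k=2))
```

In this sketch a sentence that shares no words with the others receives only the baseline score, so it is the first to be left out of the summary. Building one such graph per LDA topic, as the abstract describes, would keep near-duplicate sentences from different documents in the same graph and make them compete with one another.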

Cite

Merniz, A., Chaibi, A. H., & Ghézala, H. H. B. (2021). Multi-document Arabic text summarization based on thematic annotation. In Proceedings of the 16th International Conference on Software Technologies, ICSOFT 2021 (pp. 639–644). SciTePress. https://doi.org/10.5220/0010557906390644
