Keyword extraction approaches based on a directed graph representation of text mostly rely on word positions within sentences: a preceding word points to a succeeding word, or vice versa, within a window of N consecutive words. The accuracy of this approach depends on the number of active-voice and passive-voice sentences in the given text. Edge direction can be applied only by treating the entire text as a single unit, which leaves no role for the individual sentences of the document; otherwise, words at the beginning or end of each sentence receive fewer connections (recommendations). In this paper we propose a directed graph representation technique, the Thematic text graph, in which weighted edges are drawn between words based on the theme of the document. Keyword weights are obtained from the Thematic text graph using an existing centrality measure, and the resulting weights are used to compute the importance of sentences in the document. Experiments conducted on the SemEval-2010 and DUC 2002 benchmark data sets show that the proposed keyword weighting model is effective and improves the quality of system-generated extractive summaries.
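As a rough illustration of the position-based baseline described above (not the Thematic text graph itself, whose theme-based edge weighting is not specified in this abstract), the following Python sketch builds a directed graph in which each word points to the words that follow it within a window of N consecutive words, ranks the words with a simple PageRank-style centrality, and sums the word weights to score sentences. The function names, the choice of PageRank as the centrality measure, the damping factor, and the use of co-occurrence counts as edge weights are all assumptions made for illustration.

```python
from collections import defaultdict


def build_positional_graph(tokens, window=2):
    """Build directed edges: each word points to the words that follow it
    inside a window of `window` consecutive words (hypothetical helper).
    Edge weight = number of times the ordered pair co-occurs."""
    edges = defaultdict(float)
    for i, src in enumerate(tokens):
        # A window of N consecutive words covers the next N - 1 positions.
        for dst in tokens[i + 1 : i + window]:
            if src != dst:
                edges[(src, dst)] += 1.0
    return dict(edges)


def pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration (assumed centrality measure)."""
    nodes = sorted({node for edge in edges for node in edge})
    out_weight = defaultdict(float)
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


def score_sentences(sentences, weights):
    """Score each tokenized sentence as the sum of its word weights."""
    return [sum(weights.get(tok, 0.0) for tok in sent) for sent in sentences]


# Usage: treat the whole text as one token stream, then score sentences.
sentences = [["graph", "based", "keyword", "extraction"],
             ["keyword", "weights", "rank", "sentences"]]
tokens = [tok for sent in sentences for tok in sent]
weights = pagerank(build_positional_graph(tokens, window=3))
scores = score_sentences(sentences, weights)
```

Because the tokens are flattened into a single stream, the last word of one sentence is linked to the first word of the next, which is the whole-text treatment the abstract describes; building the graph separately per sentence would instead give sentence-initial and sentence-final words fewer edges.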
CITATION STYLE
V.V. Ravinuthala, M. K., & Reddy Ch., S. (2016). Thematic Text Graph: A Text Representation Technique for Keyword Weighting in Extractive Summarization System. International Journal of Information Engineering and Electronic Business, 8(4), 18–25. https://doi.org/10.5815/ijieeb.2016.04.03