User intention-based document summarization on heterogeneous sentence networks

Abstract

Automatic extraction-based document summarization is a difficult Natural Language Processing task. Previous approaches have usually generated the summary by extracting the top K salient sentences using graph-based ranking algorithms, but their sentence feature representations capture only the surface relationships between the objects, so the results may not accurately reflect the user’s intentions. To address this challenge, we propose a method that (1) obtains deeper semantic concepts among candidate sentences using meaningful sentence vectors that combine word vectors and TF-IDF; (2) ranks sentences on a heterogeneous graph, considering both the relationships between sentences and the user’s intention for each sentence, to identify significant sentences; and (3) generates the result sentence by sentence to ensure that the summary semantics are properly related to the original document. We verified the proposed approach experimentally on the English summarization benchmark datasets DUC2001 and DUC2002 and on the large Chinese summarization dataset LCSTS. We also collected news data and had a group of expert bank auditors produce a reference summary, which we compared with the output of the proposed approach using ROUGE evaluation.
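
Steps (1) and (2) of the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and everything specific in it is an assumption made for illustration: the two-dimensional toy word vectors, the smoothed sentence-level IDF, and the representation of the user's intention as a vector in the same space. A personalized PageRank over a plain sentence-similarity graph stands in for the paper's ranking on a heterogeneous graph, whose details the abstract does not give.

```python
import math
from collections import Counter

def tfidf_weights(tokenized):
    """Per-sentence TF-IDF weights, treating each sentence as a document.
    The smoothed IDF, log((1 + n) / (1 + df)) + 1, is an assumption."""
    n = len(tokenized)
    df = Counter(w for sent in tokenized for w in set(sent))
    weights = []
    for sent in tokenized:
        tf = Counter(sent)
        weights.append({
            w: (c / len(sent)) * (math.log((1 + n) / (1 + df[w])) + 1)
            for w, c in tf.items()
        })
    return weights

def sentence_vector(weights, word_vecs, dim):
    """TF-IDF-weighted average of word vectors; unknown words are skipped."""
    vec = [0.0] * dim
    total = 0.0
    for w, wt in weights.items():
        if w in word_vecs:
            total += wt
            for i, x in enumerate(word_vecs[w]):
                vec[i] += wt * x
    return [x / total for x in vec] if total > 0 else vec

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def rank_sentences(sent_vecs, intent_vec, damping=0.85, iters=50):
    """Personalized PageRank: edges are sentence similarities, and the
    teleport distribution is each sentence's similarity to the intention."""
    n = len(sent_vecs)
    sim = [[max(cosine(sent_vecs[i], sent_vecs[j]), 0.0) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out = [sum(row) for row in sim]
    bias = [max(cosine(v, intent_vec), 0.0) for v in sent_vecs]
    total = sum(bias)
    bias = [b / total for b in bias] if total > 0 else [1.0 / n] * n
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) * bias[i]
            + damping * sum(scores[j] * sim[j][i] / out[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores

# Hypothetical toy data: axis 0 is "finance", axis 1 is "sports".
word_vecs = {
    "bank": [1.0, 0.0], "audit": [1.0, 0.0], "loan": [0.9, 0.1],
    "risk": [0.9, 0.1], "team": [0.0, 1.0], "match": [0.0, 1.0],
    "won": [0.1, 0.9],
}
sentences = [
    "the bank audit found loan risk",
    "the team won the match",
    "audit of the bank loan",
]
tokenized = [s.split() for s in sentences]
sent_vecs = [sentence_vector(w, word_vecs, 2) for w in tfidf_weights(tokenized)]
intent_vec = [1.0, 0.0]  # assumed: the user cares about finance
scores = rank_sentences(sent_vecs, intent_vec)
top2 = sorted(range(len(sentences)), key=lambda i: -scores[i])[:2]
print([sentences[i] for i in top2])
```

With the finance-oriented intention vector, the two banking sentences outrank the sports sentence. Replacing `intent_vec` with `[0.0, 1.0]` shifts the teleport mass toward the sports sentence, which is the kind of intention sensitivity the abstract describes.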

Cite

Wang, H. Y., Chang, J. W., & Huang, J. W. (2019). User intention-based document summarization on heterogeneous sentence networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11447 LNCS, pp. 572–587). Springer Verlag. https://doi.org/10.1007/978-3-030-18579-4_34
