Krimping texts for better summarization

Marina Litvak; Natalia Vanetik; Mark Last

Conference Proceedings, Open Access

Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing (2015) 1931-1935

DOI: 10.18653/v1/d15-1223

Abstract

Automated text summarization aims to extract the essential information from an original text and present it in a minimal, often predefined, number of words. In this paper, we introduce a new approach to unsupervised extractive summarization, based on the Minimum Description Length (MDL) principle and using the Krimp dataset compression algorithm (Vreeken et al., 2011). Our approach represents a text as a transactional dataset, with sentences as transactions, and then describes it by itemsets that stand for frequent sequences of words. The summary is then compiled from the sentences that best compress (and, as such, best describe) the document. The summarization problem is reduced to the maximal coverage problem, on the assumption that a summary that best describes the original text should cover most of the word sequences describing the document. We solve it with a greedy algorithm and present the evaluation results.
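
The pipeline described in the abstract (sentences as transactions, frequent word sequences as descriptors, greedy selection for maximal coverage) can be illustrated with a minimal Python sketch. This is not the authors' implementation and it does not run Krimp: as a loudly stated assumption, the MDL-based code table is replaced by plain frequent contiguous word sequences filtered by a support threshold. The function names and the parameters `min_support`, `max_len`, and `max_words` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def ngrams(tokens, max_len):
    """Return the set of contiguous word sequences of length 1..max_len."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    }


def frequent_sequences(transactions, min_support=2, max_len=3):
    """Sequences that occur in at least min_support sentences.

    Assumption: this simple support filter stands in for the Krimp
    code table; it performs no MDL-based selection.
    """
    support = Counter()
    for tokens in transactions:
        # Each sentence contributes at most once per sequence.
        support.update(ngrams(tokens, max_len))
    return {seq for seq, count in support.items() if count >= min_support}


def greedy_summary(sentences, max_words, min_support=2, max_len=3):
    """Greedily pick sentences that cover the most uncovered sequences."""
    transactions = [s.lower().split() for s in sentences]
    frequent = frequent_sequences(transactions, min_support, max_len)
    # Frequent sequences that each sentence contains.
    covers = [ngrams(t, max_len) & frequent for t in transactions]

    uncovered = set(frequent)
    chosen, used_words = [], 0
    while uncovered:
        best, best_gain = None, 0
        for i, tokens in enumerate(transactions):
            # Skip sentences already chosen or exceeding the word budget.
            if i in chosen or used_words + len(tokens) > max_words:
                continue
            gain = len(covers[i] & uncovered)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # nothing fits or nothing adds coverage
            break
        chosen.append(best)
        used_words += len(transactions[best])
        uncovered -= covers[best]
    # Present the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    doc = [
        "the cat sat on the mat",
        "the cat sat on the sofa",
        "dogs bark loudly",
        "the mat is red",
    ]
    print(greedy_summary(doc, max_words=6))
```

The greedy loop mirrors the standard heuristic for maximal coverage: at each step it takes the sentence with the largest marginal gain in covered sequences, subject to the word budget. Whitespace tokenization is a further simplification made here for brevity.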

Cite

Litvak, M., Vanetik, N., & Last, M. (2015). Krimping texts for better summarization. In Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing (pp. 1931–1935). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d15-1223