Although human-written summaries of documents tend to involve significant edits to the source text, most automated summarizers are extractive and select sentences verbatim. In this work we examine how elementary discourse units (EDUs) from Rhetorical Structure Theory can be used to extend extractive summarizers to produce a wider range of human-like summaries. Our analysis demonstrates that EDU segmentation is effective in preserving human-labeled summarization concepts within sentences and also aligns with near-extractive summaries constructed by news editors. Finally, we show that using EDUs as units of content selection instead of sentences leads to stronger summarization performance in near-extractive scenarios, especially under tight budgets.
CITATION STYLE
Li, J. J., Thadani, K., & Stent, A. (2016). The Role of Discourse Units in Near-Extractive Summarization. In SIGDIAL 2016 - 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Proceedings of the Conference (pp. 137–147). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w16-3617