Predicting Discourse Trees from Transformer-based Neural Summarizers

Abstract

Previous work indicates that discourse information benefits summarization. In this paper, we explore whether this synergy between discourse and summarization is bidirectional, by inferring document-level discourse trees from pre-trained neural summarizers. In particular, we generate unlabeled RST-style discourse trees from the self-attention matrices of the transformer model. Experiments across models and datasets reveal that the summarizer learns both dependency- and constituency-style discourse information, which is typically encoded in a single head, covering long- and short-distance discourse dependencies. Overall, the experimental results suggest that the learned discourse information is general and transferable across domains.
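To make the core idea concrete, the following is a minimal sketch (not the authors' code) of one way to derive an unlabeled discourse dependency tree from a self-attention matrix: pool token-level attention into an EDU-level matrix, then decode a tree with a maximum spanning arborescence. The helper names (`attention_to_edu_matrix`, `edu_matrix_to_tree`), the mean-pooling over EDU spans, and the Chu-Liu/Edmonds decoding via networkx are illustrative assumptions; the paper itself derives both dependency- and constituency-style trees and evaluates them against RST treebanks.

```python
import numpy as np
import networkx as nx

def attention_to_edu_matrix(attn, edu_spans):
    """Pool a (seq_len x seq_len) token attention matrix into an
    (n_edus x n_edus) matrix by averaging over each EDU's token span."""
    n = len(edu_spans)
    edu_attn = np.zeros((n, n))
    for i, (si, ei) in enumerate(edu_spans):
        for j, (sj, ej) in enumerate(edu_spans):
            edu_attn[i, j] = attn[si:ei, sj:ej].mean()
    return edu_attn

def edu_matrix_to_tree(edu_attn):
    """Decode an unlabeled dependency tree: treat attention from EDU i
    to EDU j as the score of the arc j -> i, then take the maximum
    spanning arborescence (Chu-Liu/Edmonds, via networkx)."""
    n = edu_attn.shape[0]
    g = nx.DiGraph()
    for head in range(n):
        for dep in range(n):
            if head != dep:
                g.add_edge(head, dep, weight=float(edu_attn[dep, head]))
    tree = nx.maximum_spanning_arborescence(g)
    return sorted(tree.edges())

# Toy usage: 6 tokens grouped into 3 EDUs, one random "attention head".
rng = np.random.default_rng(0)
attn = rng.random((6, 6))
spans = [(0, 2), (2, 4), (4, 6)]
print(edu_matrix_to_tree(attention_to_edu_matrix(attn, spans)))
```

In practice one would run this per attention head and compare the resulting trees against gold RST structures, which is how a single discourse-rich head, as described in the abstract, could be identified.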

Citation (APA)

Xiao, W., Huber, P., & Carenini, G. (2021). Predicting Discourse Trees from Transformer-based Neural Summarizers. In NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp. 4139–4152). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.naacl-main.436
