A Thorough Evaluation of Task-Specific Pretraining for Summarization

Abstract

Task-agnostic pretraining objectives like masked language modeling or corrupted span prediction are applicable to a wide range of NLP downstream tasks (Raffel et al., 2019), but are outperformed by task-specific pretraining objectives such as predicting extracted gap sentences for summarization (Zhang et al., 2020). We compare three summarization-specific pretraining objectives with the task-agnostic corrupted span prediction objective in a controlled study. We also extend our study to low-resource and zero-shot setups to understand how many training examples are needed before the task-specific pretraining can be ablated without quality loss. Our results show that task-agnostic pretraining is sufficient in most cases, which we hope reduces the need for costly task-specific pretraining. We also report new state-of-the-art numbers for two summarization tasks using a T5 model with 11 billion parameters and an optimal beam-search length penalty.
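To make the contrast concrete, the sketch below builds training pairs for the two objective families the abstract compares: task-agnostic corrupted span prediction (T5-style) and task-specific gap sentence prediction (PEGASUS-style). This is a minimal illustration, not the paper's code; the sentinel and mask token names and the length-based sentence selection are simplifying assumptions (PEGASUS actually selects gap sentences by ROUGE against the remainder of the document).

```python
import random

SENTINEL = "<extra_id_0>"  # T5-style sentinel token (assumed naming)
MASK_SENT = "<mask_1>"     # PEGASUS-style sentence mask (assumed naming)


def corrupted_span_example(tokens, span_len=3):
    """Task-agnostic objective: drop a random token span from the input
    and train the model to predict it from a sentinel (T5-style)."""
    start = random.randrange(0, max(1, len(tokens) - span_len))
    span = tokens[start:start + span_len]
    inputs = tokens[:start] + [SENTINEL] + tokens[start + span_len:]
    targets = [SENTINEL] + span
    return " ".join(inputs), " ".join(targets)


def gap_sentence_example(sentences, n_gaps=1):
    """Task-specific objective: mask whole 'important' sentences and
    predict them (PEGASUS-style). Importance here is just sentence
    length, a stand-in for the ROUGE-based selection in the paper."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: len(sentences[i]), reverse=True)
    gaps = set(ranked[:n_gaps])
    inputs = [MASK_SENT if i in gaps else s for i, s in enumerate(sentences)]
    targets = [sentences[i] for i in sorted(gaps)]
    return " ".join(inputs), " ".join(targets)


if __name__ == "__main__":
    doc = ["The reactor shut down overnight.",
           "Engineers traced the fault to a cooling valve.",
           "Service resumed by morning."]
    print(corrupted_span_example(" ".join(doc).split()))
    print(gap_sentence_example(doc))
```

The key structural difference the study probes is visible here: the gap sentence objective already looks like abstractive summarization (whole salient sentences as targets), while span corruption is generic denoising.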

Cite (APA)

Rothe, S., Maynez, J., & Narayan, S. (2021). A Thorough Evaluation of Task-Specific Pretraining for Summarization. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 140–145). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.12
