Abstract
Task-agnostic pretraining objectives like masked language modeling or corrupted span prediction are applicable to a wide range of NLP downstream tasks (Raffel et al., 2019), but are outperformed on summarization by task-specific pretraining objectives like predicting extracted gap sentences (Zhang et al., 2020). We compare three summarization-specific pretraining objectives with the task-agnostic corrupted span prediction pretraining in a controlled study. We also extend our study to a low-resource and zero-shot setup, to understand how many training examples are needed before task-specific pretraining can be dropped without quality loss. Our results show that task-agnostic pretraining is sufficient in most cases, which hopefully reduces the need for costly task-specific pretraining. We also report new state-of-the-art numbers for two summarization tasks using a T5 model with 11 billion parameters and an optimal beam search length penalty.
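As a rough illustration of the distinction the abstract draws, the sketch below builds toy training pairs for the two kinds of objectives: T5-style corrupted span prediction (task-agnostic) and PEGASUS-style gap sentence prediction (task-specific). The corruption rate, span length, sentinel/mask token format, and the length-based "importance" heuristic for picking gap sentences are illustrative assumptions, not the configurations used in the paper.

import random

document = (
    "The quick brown fox jumps over the lazy dog. "
    "It then runs into the forest. "
    "The dog goes back to sleep."
)
sentences = [s.strip() + "." for s in document.split(".") if s.strip()]

def corrupted_span_prediction(tokens, corruption_rate=0.15, span_len=3):
    # Task-agnostic (T5-style): replace random token spans with sentinel
    # tokens; the target reconstructs the dropped spans.
    tokens = tokens[:]
    n_spans = max(1, int(len(tokens) * corruption_rate / span_len))
    targets = []
    for i in range(n_spans):
        start = random.randrange(0, len(tokens) - span_len)
        sentinel = f"<extra_id_{i}>"
        targets.append(sentinel + " " + " ".join(tokens[start:start + span_len]))
        tokens[start:start + span_len] = [sentinel]
    return " ".join(tokens), " ".join(targets)

def gap_sentence_prediction(sents, gap_ratio=1 / 3):
    # Task-specific (PEGASUS-style): mask whole "important" sentences and
    # predict them as a pseudo-summary. Here importance is simply sentence
    # length, a stand-in for the ROUGE-based selection in the PEGASUS paper.
    n_gaps = max(1, int(len(sents) * gap_ratio))
    selected = set(sorted(range(len(sents)), key=lambda i: -len(sents[i]))[:n_gaps])
    inputs = " ".join("<mask_1>" if i in selected else s for i, s in enumerate(sents))
    target = " ".join(sents[i] for i in sorted(selected))
    return inputs, target

if __name__ == "__main__":
    random.seed(0)
    print(corrupted_span_prediction(document.split()))
    print(gap_sentence_prediction(sentences))

The gap sentence objective yields input/target pairs that already look like document/summary pairs, which is why it is considered summarization-specific, whereas span corruption makes no assumption about the downstream task.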
Citation
Rothe, S., Maynez, J., & Narayan, S. (2021). A Thorough Evaluation of Task-Specific Pretraining for Summarization. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021) (pp. 140–145). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.emnlp-main.12