ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-shot Generalization


Abstract

We propose ZeroPrompt, a multitask pretraining approach for zero-shot generalization that focuses on task scaling and zero-shot prompting. Whereas previous models are trained on only a few dozen tasks, we scale to 1,000 tasks for the first time using real-world data. This leads to a crucial finding: task scaling can be an efficient alternative to model scaling, i.e., model size has less impact on performance when the number of training tasks is extremely large. Our results show that on the datasets we consider, task scaling can improve training efficiency by 30 times in FLOPs. Empirically, ZeroPrompt substantially improves both the efficiency and the performance of zero-shot learning across a variety of academic and production datasets.
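The abstract describes casting a large number of supervised tasks into a prompted text-to-text format and pretraining on them jointly. The sketch below illustrates that general data-construction idea only; all names (Task, build_prompted_examples, the templates and verbalizers) are hypothetical and are not taken from the paper's implementation.

```python
# Minimal sketch of prompt-based multitask pretraining data construction.
# Hypothetical names throughout; this is an illustration of the general idea,
# not the paper's actual pipeline.
from dataclasses import dataclass
from typing import Iterator, Tuple, List, Dict


@dataclass
class Task:
    name: str                   # e.g. "toy_sentiment"
    examples: List[Dict]        # raw fields such as "text" and "label"
    template: str               # prompt template with {field} placeholders
    verbalizer: Dict            # maps raw labels to target words


def build_prompted_examples(tasks: List[Task]) -> Iterator[Tuple[str, str]]:
    """Cast every task into (input prompt, target text) pairs so a single
    text-to-text model can be pretrained on all tasks jointly."""
    for task in tasks:
        for ex in task.examples:
            source = task.template.format(**ex)      # fill the prompt template
            target = task.verbalizer[ex["label"]]    # map label to a target word
            yield source, target


# Toy usage: two tiny tasks mixed into one multitask training stream.
sentiment = Task(
    name="toy_sentiment",
    examples=[{"text": "Great movie!", "label": 1}],
    template="Review: {text} Is the sentiment positive or negative?",
    verbalizer={1: "positive", 0: "negative"},
)
topic = Task(
    name="toy_topic",
    examples=[{"text": "The team won the final.", "label": "sports"}],
    template="Article: {text} What is the topic?",
    verbalizer={"sports": "sports", "politics": "politics"},
)

for src, tgt in build_prompted_examples([sentiment, topic]):
    print(src, "->", tgt)
```

In this framing, zero-shot evaluation simply applies the same prompting step to an unseen task and lets the pretrained model generate the target word, which is the setting the abstract's task-scaling claims refer to.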

Citation (APA)

Xu, H., Chen, Y., Du, Y., Shao, N., Wang, Y., Li, H., & Yang, Z. (2022). ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-shot Generalization. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 4264–4281). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.312
