Revisiting Pretraining with Adapters

Abstract

Pretrained language models have served as the backbone for many state-of-the-art NLP results. These models are large and expensive to train. Recent work suggests that continued pretraining on task-specific data is worth the effort because it leads to improved performance on downstream tasks. We explore alternatives to full-scale task-specific pretraining of language models through the use of adapter modules, a parameter-efficient approach to transfer learning. We find that adapter-based pretraining achieves results comparable to task-specific pretraining while using a fraction of the overall trainable parameters. We further explore direct use of adapters without pretraining and find that direct fine-tuning performs mostly on par with pretrained adapter models, contradicting the previously proposed benefits of continued pretraining in full pretraining-then-fine-tuning strategies. Lastly, we perform an ablation study on task-adaptive pretraining to investigate how different hyperparameter settings affect the effectiveness of the pretraining.
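
For readers unfamiliar with adapters: they are small bottleneck layers inserted into an otherwise frozen pretrained transformer, so only a small fraction of the parameters is updated. The sketch below shows a generic Houlsby-style bottleneck adapter in PyTorch as a minimal illustration of the technique, not the authors' implementation; the hidden size, bottleneck size, activation, and naming convention are assumptions.

    import torch
    import torch.nn as nn

    class BottleneckAdapter(nn.Module):
        # Generic bottleneck adapter: down-project, nonlinearity, up-project,
        # with a residual connection so the frozen backbone's output passes through.
        def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
            super().__init__()
            self.down = nn.Linear(hidden_size, bottleneck_size)
            self.act = nn.GELU()
            self.up = nn.Linear(bottleneck_size, hidden_size)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x + self.up(self.act(self.down(x)))

    # Parameter efficiency: only adapter weights are trained; the backbone stays frozen.
    # (The "adapter" name match is an illustrative convention, not from the paper.)
    def freeze_all_but_adapters(model: nn.Module) -> None:
        for name, param in model.named_parameters():
            param.requires_grad = "adapter" in name.lower()

With hidden_size 768 and bottleneck_size 64, each adapter adds roughly 2 x 768 x 64, about 0.1M parameters per layer, which is why adapter-based pretraining and fine-tuning touch only a small fraction of the model's weights.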

Cite (APA)

Kim, S., Shum, A., Susanj, N., & Hilgart, J. (2021). Revisiting Pretraining with Adapters. In RepL4NLP 2021 - 6th Workshop on Representation Learning for NLP, Proceedings of the Workshop (pp. 90–99). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.repl4nlp-1.11
