Span Fine-tuning for Pre-trained Language Models

Abstract

Pre-trained language models (PrLMs) must carefully manage input units when training on very large texts with vocabularies of millions of words. Previous work has shown that incorporating span-level information over consecutive words during pre-training can further improve the performance of PrLMs. However, because such span-level clues are introduced and fixed at pre-training time, these methods are time-consuming and inflexible. To alleviate this inconvenience, this paper presents a novel span fine-tuning method for PrLMs in which the span setting is adaptively determined by the specific downstream task during the fine-tuning phase. In detail, each sentence processed by the PrLM is segmented into multiple spans according to a pre-sampled dictionary. The segmentation information is then passed through a hierarchical CNN module together with the representation outputs of the PrLM, ultimately generating a span-enhanced representation. Experiments on the GLUE benchmark show that the proposed span fine-tuning method significantly enhances the PrLM while offering more flexibility in an efficient way. The code is available at https://github.com/BAORONGZHOU/spanfine-tuning.
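For concreteness, the segment-then-pool idea described in the abstract can be sketched roughly as follows. This is a minimal illustration assuming a PyTorch setup; the class and argument names (SpanCNN, spans) are hypothetical and are not taken from the authors' released code.

```python
# Minimal sketch of span-enhanced representations, assuming PyTorch.
# Names and shapes are illustrative; see the authors' repository for the actual implementation.
import torch
import torch.nn as nn


class SpanCNN(nn.Module):
    """Pools each span of PrLM token representations with a small CNN and
    concatenates the resulting span vector back onto every token in that span."""

    def __init__(self, hidden_size: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(hidden_size, hidden_size,
                              kernel_size, padding=kernel_size // 2)

    def forward(self, hidden_states, spans):
        # hidden_states: (batch, seq_len, hidden) -- token-level PrLM outputs
        # spans: per-example list of (start, end) token indices produced by the
        #        dictionary-based segmenter (hypothetical interface)
        enhanced = []
        for b, example_spans in enumerate(spans):
            span_feats = torch.zeros_like(hidden_states[b])
            for start, end in example_spans:
                seg = hidden_states[b, start:end].transpose(0, 1).unsqueeze(0)  # (1, hidden, span_len)
                pooled = self.conv(seg).max(dim=-1).values.squeeze(0)           # (hidden,)
                span_feats[start:end] = pooled  # broadcast the span vector to its tokens
            enhanced.append(torch.cat([hidden_states[b], span_feats], dim=-1))
        return torch.stack(enhanced)  # (batch, seq_len, 2 * hidden)
```

In this sketch the span-enhanced output would simply replace the plain PrLM output as input to the downstream task head during fine-tuning, which is what lets the span setting adapt per task without re-running pre-training.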

Citation

Bao, R., Zhang, Z., & Zhao, H. (2021). Span Fine-tuning for Pre-trained Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2021 (pp. 1970–1979). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.findings-emnlp.169
