Abstract
State-of-the-art transformer models have achieved robust performance on a variety of NLP tasks. Many of these approaches employ domain-agnostic pre-training tasks to train models that yield highly generalized sentence representations, which can then be fine-tuned for specific downstream tasks. We propose refining a pre-trained NLP model using the objective of detecting shuffled tokens. We take a sequential approach, starting from the pre-trained RoBERTa model and further training it with our objective. Applying a random word-level shuffling strategy, we find that our approach enables the RoBERTa model to achieve better performance on 4 out of 7 GLUE tasks. Our results indicate that learning to detect shuffled tokens is a promising approach for learning more coherent sentence representations.
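The abstract describes the objective only at a high level. Below is a minimal, illustrative sketch of shuffled-token detection using PyTorch and HuggingFace Transformers. The 15% shuffle probability, the binary token-classification head, and the helper names `shuffle_words` and `encode` are assumptions made for illustration, not the authors' exact setup or hyperparameters.

```python
import random

import torch
from transformers import RobertaForTokenClassification, RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
# Token-level binary head: label 0 = token in its original position, 1 = shuffled.
model = RobertaForTokenClassification.from_pretrained("roberta-base", num_labels=2)


def shuffle_words(words, shuffle_prob=0.15):
    """Permute a random subset of word positions; label the words that moved."""
    original = list(words)
    shuffled = list(words)
    picked = [i for i in range(len(words)) if random.random() < shuffle_prob]
    targets = list(picked)
    random.shuffle(targets)
    for src, dst in zip(picked, targets):
        shuffled[dst] = original[src]
    labels = [int(new != old) for new, old in zip(shuffled, original)]
    return shuffled, labels


def encode(words, word_labels):
    """Tokenize pre-split words and project word-level labels onto subword tokens."""
    enc = tokenizer(words, is_split_into_words=True,
                    truncation=True, return_tensors="pt")
    token_labels = [-100 if wid is None else word_labels[wid]  # -100: ignore special tokens
                    for wid in enc.word_ids(batch_index=0)]
    enc["labels"] = torch.tensor([token_labels])
    return enc


words, labels = shuffle_words("the quick brown fox jumps over the lazy dog".split())
batch = encode(words, labels)
loss = model(**batch).loss  # per-token cross-entropy: shuffled vs. not shuffled
loss.backward()
```

Framing the task as token classification, so every subword piece inherits its word's shuffled/not-shuffled label, mirrors ELECTRA-style replaced-token detection; whether the authors handle subwords this way is an assumption of this sketch.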
Citation
Panda, S., Agrawal, A., Ha, J., & Bloch, B. (2021). Shuffled-token Detection for Refining Pre-trained RoBERTa. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Student Research Workshop (pp. 88–93). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.naacl-srw.12