Improving Contextual Representation with Gloss Regularized Pre-training

Yu Lin; Zhecheng An; Peihao Wu; Zejun Ma

Conference Proceedings, Open Access

Findings of the Association for Computational Linguistics: NAACL 2022 - Findings (2022) 907-920

DOI: 10.18653/v1/2022.findings-naacl.68

Abstract

Though BERT-like masked language models (MLMs) achieve impressive results on many NLP tasks, they suffer from a discrepancy between pre-training and inference. In light of this gap, we investigate the contextual representation in pre-training and inference from the perspective of word probability distributions. We find that BERT risks neglecting contextual word similarity during pre-training. To tackle this issue, we propose adding an auxiliary gloss regularizer module to BERT pre-training (GR-BERT) to enhance word semantic similarity. By simultaneously predicting masked words and aligning contextual embeddings to their corresponding glosses, word similarity can be modeled explicitly. We design two architectures for GR-BERT and evaluate our model on downstream tasks. Experimental results show that the gloss regularizer benefits BERT in both word-level and sentence-level semantic representation. GR-BERT achieves a new state of the art on the lexical substitution task and greatly improves BERT sentence representation in both unsupervised and supervised STS tasks.
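The joint objective described in the abstract (predicting a masked word while aligning its contextual embedding to a gloss embedding) might be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the abstract does not state the form of the regularizer, so the cosine-distance alignment term, the `weight` parameter, and all function names here are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mlm_loss(logits, target_index):
    """Cross-entropy of the true token at one masked position."""
    return -math.log(softmax(logits)[target_index])


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gloss_alignment_loss(context_vec, gloss_vec):
    """Assumed regularizer: cosine distance between the contextual
    embedding of the masked word and the embedding of its gloss."""
    return 1.0 - cosine(context_vec, gloss_vec)


def joint_loss(logits, target_index, context_vec, gloss_vec, weight=1.0):
    """MLM loss plus a weighted gloss alignment term.
    `weight` is a hypothetical hyperparameter, not taken from the paper."""
    return mlm_loss(logits, target_index) + weight * gloss_alignment_loss(
        context_vec, gloss_vec
    )
```

Under these assumptions, the alignment term vanishes when the contextual embedding points in the same direction as the gloss embedding, so only the masked-word prediction loss remains; a mismatch between the two adds a penalty scaled by `weight`.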

Citation (APA)

Lin, Y., An, Z., Wu, P., & Ma, Z. (2022). Improving Contextual Representation with Gloss Regularized Pre-training. In Findings of the Association for Computational Linguistics: NAACL 2022 - Findings (pp. 907–920). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-naacl.68
