A Comparative Study of Pre-trained Encoders for Low-Resource Named Entity Recognition

Yuxuan Chen; Jonas Mikkelsen; Arne Binder; Christoph Alt; Leonhard Hennig

Conference Proceedings (Open Access)

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2022) 46-59

DOI: 10.18653/v1/2022.repl4nlp-1.6

Abstract

Pre-trained language models (PLMs) are effective components of few-shot named entity recognition (NER) approaches when augmented with continued pre-training on task-specific out-of-domain data or fine-tuning on in-domain data. However, their performance in low-resource scenarios, where such data is not available, remains an open question. We introduce an encoder evaluation framework and use it to systematically compare the performance of state-of-the-art pre-trained representations on the task of low-resource NER. We analyze a wide range of encoders pre-trained with different strategies, model architectures, intermediate-task fine-tuning, and contrastive learning. Our experimental results across ten benchmark NER datasets in English and German show that encoder performance varies significantly, suggesting that the choice of encoder for a specific low-resource scenario needs to be carefully evaluated.

Citation (APA)

Chen, Y., Mikkelsen, J., Binder, A., Alt, C., & Hennig, L. (2022). A Comparative Study of Pre-trained Encoders for Low-Resource Named Entity Recognition. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 46–59). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.repl4nlp-1.6
