Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training

58 Citations · 119 Mendeley Readers

Abstract

We study the problem of training named entity recognition (NER) models using only distantly-labeled data, which can be automatically obtained by matching entity mentions in the raw text with entity types in a knowledge base. The biggest challenge of distantly-supervised NER is that the distant supervision may introduce incomplete and noisy labels, rendering the straightforward application of supervised learning ineffective. In this paper, we propose (1) a noise-robust learning scheme comprising a new loss function and a noisy label removal step, for training NER models on distantly-labeled data, and (2) a self-training method that uses contextualized augmentations created by pre-trained language models to improve the generalization ability of the NER model. On three benchmark datasets, our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
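A common building block for noise-robust schemes like the one the abstract describes is a bounded loss that down-weights tokens whose distant labels the model disagrees with; one widely used choice is the generalized cross-entropy (GCE) loss, L_q = (1 − p_y^q)/q, which interpolates between standard cross-entropy (q → 0) and the noise-tolerant mean absolute error (q = 1). The sketch below is illustrative only: the function name, shapes, and the use of GCE itself are assumptions, not a reproduction of the paper's exact loss.

```python
import numpy as np

def gce_loss(logits, labels, q=0.7):
    """Generalized cross-entropy loss, a standard noise-robust
    alternative to plain cross-entropy (illustrative sketch; the
    paper's actual loss function may differ).

    logits: (n_tokens, n_classes) raw scores
    labels: (n_tokens,) integer class ids from distant supervision
    q:      interpolation parameter in (0, 1]; larger q is more
            tolerant of mislabeled tokens
    """
    # numerically stable softmax over classes
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    # probability assigned to each token's (possibly noisy) label
    p_y = probs[np.arange(len(labels)), labels]
    # L_q = (1 - p_y^q) / q, averaged over tokens
    return float(((1.0 - p_y ** q) / q).mean())
```

Because L_q is bounded above by 1/q, a single badly mislabeled token cannot dominate the gradient the way it can under unbounded cross-entropy, which is the intuition behind using such losses on distantly-labeled data.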

Citation (APA)
Meng, Y., Zhang, Y., Huang, J., Wang, X., Zhang, Y., Ji, H., & Han, J. (2021). Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training. In EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 10367–10378). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.emnlp-main.810
