Named Entity Recognition via Noise Aware Training Mechanism with Data Filter

Abstract

Named entity recognition (NER) is a fundamental task in natural language processing, and there is a long-held belief that more data benefits the model. However, not all data help with generalization, and some samples may contain ambiguous entities or noisy labels. Existing methods cannot distinguish hard samples from noisy samples well, which becomes particularly challenging in the presence of overfitting. This paper proposes a new method called Noise-Aware-with-Filter (NAF) that addresses these issues from two sides. From the perspective of the data, we design a Logit-Maximum-Difference (LMD) mechanism, which maximizes the diversity between different samples to help the model identify noisy samples. From the perspective of the model, we design an Incomplete-Trust (In-trust) loss function, which augments the CRF loss (L_CRF) with a robust Distrust-Cross-Entropy (DCE) term. The proposed In-trust loss effectively alleviates the overfitting caused by previous loss functions. Experiments on six real-world Chinese and English NER datasets show that NAF outperforms previous methods and obtains state-of-the-art (SOTA) results on the CoNLL2003 and CoNLL++ datasets.
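
To make the In-trust idea concrete, the sketch below is a minimal, hypothetical PyTorch rendering based on one plausible reading of the abstract: a standard supervised term (token-level cross-entropy here, standing in for the paper's CRF loss L_CRF) combined with a Distrust-Cross-Entropy term that mixes the model's own prediction with the possibly noisy label before taking the log. The weights alpha, beta, and delta are illustrative assumptions, not values from the paper; the exact formulation is given in the paper itself.

    # Hypothetical sketch of an "Incomplete-Trust" style loss: a supervised term plus
    # a Distrust-Cross-Entropy (DCE) term that partially trusts the model's own
    # prediction p over the possibly noisy one-hot label q. Token-level cross-entropy
    # stands in for the paper's CRF loss; alpha, beta, delta are illustrative weights.
    import torch
    import torch.nn.functional as F

    def in_trust_loss(logits, labels, alpha=1.0, beta=1.0, delta=0.5, ignore_index=-100):
        # logits: (batch, seq_len, num_tags); labels: (batch, seq_len) tag ids.
        num_tags = logits.size(-1)
        flat_logits = logits.reshape(-1, num_tags)
        flat_labels = labels.reshape(-1)

        # Standard cross-entropy against the (possibly noisy) annotations.
        ce = F.cross_entropy(flat_logits, flat_labels, ignore_index=ignore_index)

        # DCE term: -p * log(delta * p + (1 - delta) * q), averaged over valid tokens.
        mask = flat_labels != ignore_index
        p = F.softmax(flat_logits[mask], dim=-1)
        q = F.one_hot(flat_labels[mask], num_classes=num_tags).float()
        mixed = (delta * p + (1.0 - delta) * q).clamp_min(1e-12)
        dce = -(p * torch.log(mixed)).sum(dim=-1).mean()

        return alpha * ce + beta * dce

    # Example: 2 sentences, 5 tokens each, 9 BIO tags.
    logits = torch.randn(2, 5, 9, requires_grad=True)
    labels = torch.randint(0, 9, (2, 5))
    loss = in_trust_loss(logits, labels)
    loss.backward()

The intuition captured here is that, because the model distribution p appears inside the logarithm, a confidently predicted tag keeps the penalty small even when the annotation disagrees, so the training signal does not fully trust noisy labels.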

Cite (APA)

Huang, X., Chen, Y., Wu, S., Zhao, J., Xie, Y., & Sun, W. (2021). Named Entity Recognition via Noise Aware Training Mechanism with Data Filter. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 4791–4803). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.423
