CNNBiF: CNN-based Bigram Features for Named Entity Recognition


Abstract

Transformer models fine-tuned with a sequence labeling objective have become the dominant choice for named entity recognition tasks. However, a self-attention mechanism with unconstrained length can fail to fully capture local dependencies, particularly when training data is limited. In this paper, we propose a novel joint training objective which better captures the semantics of words corresponding to the same entity. By augmenting the training objective with a group-consistency loss component, we enhance our ability to capture local dependencies while still enjoying the advantages of the unconstrained self-attention mechanism. On the CoNLL2003 dataset, our method achieves a test F1 of 93.98 with a single transformer model. More importantly, our fine-tuned CoNLL2003 model displays significant gains in generalization to out-of-domain datasets: on the OntoNotes subset we achieve an F1 of 72.67, which is 0.49 absolute points better than the baseline, and on the WNUT16 set an F1 of 68.22, a gain of 0.48 points. Furthermore, on the WNUT17 dataset we achieve an F1 of 55.85, a 2.92-point absolute improvement.
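The abstract does not spell out how the group-consistency term is computed; that detail is in the full paper. Purely as an illustrative sketch of such a term (not the authors' implementation), one could penalize how far the encoder representations of tokens within the same gold entity span drift from the span's mean. In the Python sketch below, the function name group_consistency_loss, the mean-squared-deviation penalty, and the weight lam are all assumptions for illustration.

import torch
import torch.nn.functional as F

def group_consistency_loss(hidden, spans):
    # Hypothetical auxiliary loss: pull the hidden states of tokens in the
    # same gold entity span toward that span's mean representation.
    # hidden: (seq_len, dim) token representations from the encoder
    # spans:  list of (start, end) pairs, end exclusive, one per entity
    if not spans:
        return hidden.new_zeros(())
    penalties = []
    for start, end in spans:
        group = hidden[start:end]                   # tokens of one entity
        centroid = group.mean(dim=0, keepdim=True)  # span centroid
        penalties.append(F.mse_loss(group, centroid.expand_as(group)))
    return torch.stack(penalties).mean()

def joint_loss(logits, labels, hidden, spans, lam=0.1):
    # Joint objective in the spirit of the abstract: token-level
    # cross-entropy plus a weighted consistency term; lam is illustrative.
    ce = F.cross_entropy(logits.view(-1, logits.size(-1)), labels.view(-1))
    return ce + lam * group_consistency_loss(hidden, spans)

In practice one would also mask padding tokens and aggregate the term over a batch; those details are omitted from this sketch.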

Citation (APA)

Sung, C., Goel, V., Marcheret, E., Rennie, S. J., & Nahamoo, D. (2021). CNNBiF: CNN-based Bigram Features for Named Entity Recognition. In Findings of the Association for Computational Linguistics: EMNLP 2021 (pp. 1016–1021). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.findings-emnlp.87
