Contextualized embeddings use unsupervised language model pretraining to compute word representations that depend on their context. This is intuitively useful for generalization, especially in Named-Entity Recognition (NER), where detecting mentions never seen during training is crucial. However, standard English benchmarks overestimate the importance of lexical features over contextual ones because of an unrealistic lexical overlap between train and test mentions. In this paper, we perform an empirical analysis of the generalization capabilities of state-of-the-art contextualized embeddings by separating test mentions by novelty and by evaluating out-of-domain. We show that contextualized embeddings are particularly beneficial for detecting unseen mentions, especially out-of-domain. For models trained on CoNLL03, language model contextualization yields a maximal relative micro-F1 increase of +1.2% in-domain, against +13% out-of-domain on the WNUT dataset. The code is available at https://github.com/btaille/contener.
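The evaluation described above hinges on splitting test mentions by novelty with respect to the training set. As a rough illustration of that idea (a minimal sketch, not the authors' implementation; the data structures, function name, and the lowercased exact-match criterion are all assumptions), the partition could look like this:

```python
# Sketch of a mention-novelty split: test mentions whose surface form
# appears among the train mentions are "seen"; the rest are "unseen".
# The (surface, type) tuple representation and lowercased exact matching
# are simplifying assumptions for illustration only.

from typing import List, Tuple

Mention = Tuple[str, str]  # (surface form, entity type)

def split_by_novelty(train: List[Mention], test: List[Mention]):
    """Partition test mentions into seen/unseen w.r.t. train surface forms."""
    train_surfaces = {surface.lower() for surface, _ in train}
    seen = [m for m in test if m[0].lower() in train_surfaces]
    unseen = [m for m in test if m[0].lower() not in train_surfaces]
    return seen, unseen

if __name__ == "__main__":
    train = [("Paris", "LOC"), ("EU", "ORG")]
    test = [("Paris", "LOC"), ("Strasbourg", "LOC")]
    seen, unseen = split_by_novelty(train, test)
    print("seen:", seen)      # [('Paris', 'LOC')]
    print("unseen:", unseen)  # [('Strasbourg', 'LOC')]
```

Per-partition metrics (e.g., micro-F1 computed separately on the seen and unseen subsets) then expose how much a model relies on lexical memorization versus contextual cues.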
Taillé, B., Guigue, V., & Gallinari, P. (2020). Contextualized embeddings in named-entity recognition: An empirical study on generalization. In Lecture Notes in Computer Science, vol. 12036, pp. 383–391. Springer. https://doi.org/10.1007/978-3-030-45442-5_48