Named entity recognition in Russian with word representation learned by a bidirectional language model


Abstract

Named entity recognition is one of the most popular tasks in natural language processing. Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for natural language processing tasks. However, in most cases, the recurrent network that operates on word-level representations to produce context-sensitive representations is trained on relatively little labeled data. In addition, processing the Russian language presents many difficulties. In this paper, we present a semi-supervised approach that adds deep contextualized word representations modeling both complex characteristics of word usage (e.g., syntax and semantics) and how these usages vary across linguistic contexts (i.e., polysemy). Here, word vectors are learned functions of the internal states of a deep bidirectional language model, which is pre-trained on a large text corpus. We show that these representations can easily be added to existing models and combined with other word-representation features. We evaluate our model on the FactRuEval-2016 dataset for named entity recognition in Russian and achieve state-of-the-art results.
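The abstract does not include an implementation, but the mechanism it describes, learning word vectors as functions of a pretrained bidirectional language model's internal states and combining them with other word features, follows the ELMo-style scalar mix. The PyTorch sketch below illustrates one plausible form of that combination; the layer count, dimensions, and all names (ScalarMix, biLM_states, static_emb) are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class ScalarMix(nn.Module):
    """Hypothetical ELMo-style mix over biLM layers (a sketch, not the
    authors' implementation). Given activations from L layers of a
    pretrained bidirectional LM, learn softmax-normalized weights s_k
    and a global scale gamma, and return gamma * sum_k s_k * h_k."""

    def __init__(self, num_layers: int):
        super().__init__()
        self.weights = nn.Parameter(torch.zeros(num_layers))  # s_k before softmax
        self.gamma = nn.Parameter(torch.ones(1))              # global scaling factor

    def forward(self, layer_states: torch.Tensor) -> torch.Tensor:
        # layer_states: (num_layers, batch, seq_len, dim)
        s = torch.softmax(self.weights, dim=0)
        mixed = (s.view(-1, 1, 1, 1) * layer_states).sum(dim=0)
        return self.gamma * mixed  # (batch, seq_len, dim)

# Usage: concatenate the contextual vector with static word embeddings
# before feeding the NER tagger (all dimensions are assumptions).
num_layers, batch, seq_len, lm_dim, emb_dim = 3, 2, 10, 1024, 300
biLM_states = torch.randn(num_layers, batch, seq_len, lm_dim)  # from a pretrained biLM
static_emb = torch.randn(batch, seq_len, emb_dim)              # e.g. fastText vectors

mix = ScalarMix(num_layers)
contextual = mix(biLM_states)                             # (batch, seq_len, lm_dim)
ner_input = torch.cat([static_emb, contextual], dim=-1)   # (batch, seq_len, emb_dim + lm_dim)
```

Because the mixing weights and scale are learned jointly with the downstream tagger, the contextual representation plugs into an existing NER model without changing its architecture, which matches the paper's claim that the representations "can be easily added to existing models".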

Citation (APA)

Konoplich, G., Putin, E., Filchenkov, A., & Rybka, R. (2018). Named entity recognition in Russian with word representation learned by a bidirectional language model. In Communications in Computer and Information Science (Vol. 930, pp. 48–58). Springer Verlag. https://doi.org/10.1007/978-3-030-01204-5_5
