Morpheme level word embedding

Abstract

Modern NLP tasks such as sentiment analysis, semantic analysis, and text entity extraction depend on the quality of the underlying language model. Language structure influences this quality: a model that fits analytic languages well for some NLP tasks does not fit synthetic languages well enough for the same tasks. For example, the well-known Word2Vec [27] model shows good results for English, which is an analytic rather than a synthetic language, but for some NLP tasks Word2Vec has problems with synthetic languages due to their high inflection. Since every morpheme in a synthetic language carries information, we propose a morpheme-level model for solving various NLP tasks. We consider the Russian language in our experiments. Firstly, we describe how to build a morpheme extractor from prepared vocabularies; our extractor reached 91% accuracy on vocabularies with known morpheme segmentation. Secondly, we show how it can be applied to NLP tasks, and then we discuss our results, their pros and cons, and our future work.
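The abstract does not describe the paper's actual architecture, so the following is only an illustrative sketch of the general morpheme-level embedding idea: segment each word into morphemes and compose the word vector from morpheme vectors (here by summation). The `SEGMENTATIONS` table and all names are hypothetical stand-ins; in the paper, segmentations would come from the trained morpheme extractor, and morpheme vectors would be learned rather than random.

```python
# Illustrative sketch only -- not the paper's implementation.
import numpy as np

# Hypothetical morpheme segmentations (in the paper these would be
# produced by an extractor trained on vocabularies with known
# morpheme segmentation).
SEGMENTATIONS = {
    "unhappiness": ["un", "happi", "ness"],
    "happiness": ["happi", "ness"],
}

DIM = 8
rng = np.random.default_rng(0)
# Toy morpheme embedding table; a real model would learn these
# (e.g. with a skip-gram objective over morpheme sequences).
morpheme_vecs = {
    m: rng.normal(size=DIM)
    for segs in SEGMENTATIONS.values()
    for m in segs
}

def word_vector(word: str) -> np.ndarray:
    """Compose a word embedding as the sum of its morpheme embeddings."""
    return sum(morpheme_vecs[m] for m in SEGMENTATIONS[word])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Words sharing morphemes get correlated vectors even when a word form
# is rare or unseen, which is the motivation for morpheme-level models
# in highly inflected (synthetic) languages such as Russian.
sim = cosine(word_vector("unhappiness"), word_vector("happiness"))
```

The design point is that the embedding vocabulary shrinks from all inflected word forms to a much smaller morpheme inventory, so rare inflections still receive meaningful vectors.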

Citation (APA)
Galinsky, R., Kovalenko, T., Yakovleva, J., & Filchenkov, A. (2018). Morpheme level word embedding. In Communications in Computer and Information Science (Vol. 789, pp. 143–155). Springer Verlag. https://doi.org/10.1007/978-3-319-71746-3_13
