Noun compositionality detection using distributional semantics for the Russian language

0Citations
Citations of this article
1Readers
Mendeley users who have this article in their library.
Get full text

Abstract

In this paper, we present the first gold-standard corpus of Russian noun compounds annotated with compositionality information. We used Universal Dependency treebanks to collect noun compounds according to part of speech patterns, such as ADJ-NOUN or NOUN-NOUN and annotated them according to the following schema: a phrase can be either compositional, non-compositional, or ambiguous (i.e., depending on the context it can be interpreted both as compositional or non-compositional). Next, we conduct a series of experiments to evaluate both unsupervised and supervised methods for predicting compositionality. To expand this manually annotated dataset with more non-compositional compounds and streamline the annotation process we use active learning. We show that not only the methods, previously proposed for English, are easily adapted for Russian, but also can be exploited in active learning paradigm, that increases the efficiency of the annotation process.

Cite

CITATION STYLE

APA

Puzyrev, D., Shelmanov, A., Panchenko, A., & Artemova, E. (2019). Noun compositionality detection using distributional semantics for the Russian language. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11832 LNCS, pp. 218–229). Springer. https://doi.org/10.1007/978-3-030-37334-4_20

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free