Weighted compositional vectors for translating collocations using monolingual corpora

Abstract

This paper presents a method for automatically identifying bilingual equivalents of collocations using only monolingual corpora in two languages. The method relies on cross-lingual distributional semantic models mapped into a shared vector space, and on compositional methods to find appropriate translations of non-congruent collocations (e.g., pay attention–prestar atenção in English–Portuguese). The strategy is evaluated on the translation of English–Portuguese and English–Spanish collocations belonging to two syntactic patterns, adjective-noun and verb-object, and is compared with other methods proposed in the literature. The experiments show that the compositional approach, based on a weighted additive model, performs better than the other strategies evaluated, and that the combined vector representations capture both the asymmetry and the compositional properties of collocations. The paper also contributes two freely available gold-standard data sets for evaluating the automatic extraction of multilingual equivalents of collocations.
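The idea of a weighted additive model in a shared vector space can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the weight `alpha`, the helper names (`compose`, `cosine`, `best_translation`), and the three-dimensional toy vectors, which are invented rather than learned from corpora. The sketch gives the base (the noun) a larger weight than the collocate, as one simple way to reflect the asymmetry of collocations, and then picks the candidate whose composed vector is closest by cosine similarity.

```python
import math


def compose(base, collocate, alpha=0.6):
    """Weighted additive composition: alpha * base + (1 - alpha) * collocate.

    alpha is a hypothetical weight; the paper's actual weighting may differ.
    """
    return [alpha * b + (1.0 - alpha) * c for b, c in zip(base, collocate)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_translation(source_vec, candidates):
    """Return the candidate label whose vector is most similar to source_vec."""
    return max(candidates, key=lambda label: cosine(source_vec, candidates[label]))


# Invented toy vectors, assumed to already live in one shared cross-lingual space.
en = {"attention": [0.9, 0.1, 0.0], "pay": [0.1, 0.8, 0.3]}
pt = {
    "atenção": [0.88, 0.12, 0.02],
    "prestar": [0.15, 0.7, 0.4],
    "pagar": [0.0, 0.2, 0.95],
}

# Compose the English collocation and each Portuguese candidate the same way.
source = compose(en["attention"], en["pay"])
candidates = {
    "prestar atenção": compose(pt["atenção"], pt["prestar"]),
    "pagar atenção": compose(pt["atenção"], pt["pagar"]),
}

result = best_translation(source, candidates)
print(result)
```

With these toy vectors the composed representation of "pay attention" is closer to "prestar atenção" than to the literal candidate "pagar atenção". In a real setting the vectors would come from monolingual embeddings mapped into a common space, and the weight would have to be chosen or tuned.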

Garcia, M., García-Salido, M., & Alonso-Ramos, M. (2019). Weighted compositional vectors for translating collocations using monolingual corpora. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11755 LNAI, pp. 113–128). Springer. https://doi.org/10.1007/978-3-030-30135-4_9
