Because parallel data are scarce for many languages, we propose a methodology that uses comparable corpora to find translation equivalents for collocations, a specific type of difficult-to-translate multi-word expression. Finding translations is known to be more difficult for collocations than for single words. Our method extracts bilingual contexts (English-Spanish in our case) and builds a distributional word representation model from them. We show that the bilingual context construction is effective for learning translation equivalents and that our method outperforms a simplified distributional similarity baseline in finding them.
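For readers unfamiliar with distributional similarity over comparable corpora, the general idea can be sketched as follows. This is a minimal, generic illustration and not the authors' implementation: the function names, the window size, the seed dictionary, and the convention of joining a collocation into a single token with an underscore (e.g. `take_place`) are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def context_vector(tokens, target, window=2):
    """Count the words that co-occur with `target` within +/- `window` tokens."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            counts.update(tokens[j] for j in range(lo, hi) if j != i)
    return counts


def project(vector, seed_dict):
    """Map source-language context words into the target language.

    Words missing from the (hypothetical) seed dictionary are dropped.
    """
    projected = Counter()
    for word, count in vector.items():
        if word in seed_dict:
            projected[seed_dict[word]] += count
    return projected


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_candidates(src_tokens, tgt_tokens, source, candidates, seed_dict, window=2):
    """Rank target-language candidates by similarity to the source expression."""
    src_vec = project(context_vector(src_tokens, source, window), seed_dict)
    scored = [
        (cand, cosine(src_vec, context_vector(tgt_tokens, cand, window)))
        for cand in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, made-up data: collocations are pre-joined into single tokens.
english = "the meeting will take_place in the city tomorrow".split()
spanish = ("la reunión va a tener_lugar en la ciudad mañana ; "
           "el niño quiere comer pan en la casa").split()
seed = {"meeting": "reunión", "city": "ciudad", "the": "la", "in": "en", "will": "va"}

ranking = rank_candidates(english, spanish, "take_place", ["comer", "tener_lugar"], seed)
print(ranking[0][0])
```

On this toy input the candidate that shares more translated context words with `take_place` is ranked first. A realistic setup would use large comparable corpora, association weighting instead of raw counts, and a proper collocation extractor.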
CITATION STYLE
Taslimipoor, S., Mitkov, R., Pastor, G. C., & Fazly, A. (2018). Bilingual contexts from comparable corpora to mine for translations of collocations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9624 LNCS, pp. 115–126). Springer Verlag. https://doi.org/10.1007/978-3-319-75487-1_10