Abstract
We extend the word2vec framework to capture meaning across languages. The input consists of a source text and a word-aligned parallel text in a second language. The joint word2vec tool then represents words in both languages within a common “semantic” vector space. The result can be used to enrich lexicons of under-resourced languages, to identify ambiguities, and to perform clustering and classification. Experiments were conducted on a parallel English-Arabic corpus, as well as on English and Hebrew Biblical texts.
Cite
Wolf, L., Hanani, Y., Bar, K., & Dershowitz, N. (2014). Joint word2vec Networks for Bilingual Semantic Representations. International Journal of Computational Linguistics and Applications, 5(1), 27–44. Retrieved from http://www.cs.tau.ac.il/~nachumd/papers/jw2v.pdf