Bilingual word vectors have been widely used in cross-language information retrieval research. However, most of this research focuses on pairs of similar languages, and very few studies explore the impact of bilingual word vectors on cross-language information retrieval for distant language pairs. This paper systematically analyzes retrieval performance for several European languages (English, German, Italian, French, Finnish, Dutch) as well as Asian languages (Chinese, Japanese) in the ad hoc task of the CLEF 2002–2003 campaigns. Genetic proximity is used to visually represent the relationships between languages and to compare their cross-lingual retrieval performance in various settings. The results show that differences in vocabulary between languages dramatically affect retrieval performance. At the same time, the term-by-term translation retrieval method performs slightly better than the simple vector addition retrieval methods. This indicates that the translation-based retrieval model can still maintain its advantage under the new semantic scheme.
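The abstract contrasts two ways of using bilingual word vectors for retrieval without spelling them out. The following is a minimal sketch of what such a comparison might look like, not the authors' implementation. The toy vocabulary, the vector values, the language prefixes, and all function names (`score_by_addition`, `translate_term`, `score_by_translation`) are hypothetical. It assumes that "vector addition" means summing word vectors into a single query or document vector and comparing them by cosine similarity, and that "term-by-term translation" means mapping each query term to its nearest target-language neighbor in the shared space before term matching.

```python
import math

# Toy shared bilingual embedding space. All values are hypothetical and
# chosen only so that translation pairs lie close together.
EMB = {
    "en:cat": [0.9, 0.1, 0.0],
    "en:car": [0.0, 0.2, 0.9],
    "de:katze": [0.85, 0.15, 0.05],
    "de:auto": [0.05, 0.1, 0.95],
    "de:hund": [0.7, 0.6, 0.0],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def add_vectors(terms):
    """Sum the vectors of all known terms; unknown terms are skipped."""
    dim = len(next(iter(EMB.values())))
    total = [0.0] * dim
    for term in terms:
        for i, x in enumerate(EMB.get(term, [0.0] * dim)):
            total[i] += x
    return total


def score_by_addition(query_terms, doc_terms):
    """Assumed 'vector addition' scoring: cosine of summed vectors."""
    return cosine(add_vectors(query_terms), add_vectors(doc_terms))


def translate_term(term, target_prefix):
    """Pick the nearest target-language word in the shared space."""
    candidates = [w for w in EMB if w.startswith(target_prefix)]
    return max(candidates, key=lambda w: cosine(EMB[term], EMB[w]))


def score_by_translation(query_terms, doc_terms, target_prefix):
    """Assumed 'term-by-term translation' scoring: translate each query
    term, then return the fraction of translated terms found in the doc."""
    translated = [translate_term(t, target_prefix) for t in query_terms if t in EMB]
    if not translated:
        return 0.0
    return sum(1 for t in translated if t in doc_terms) / len(translated)


query = ["en:cat"]
doc_a = ["de:katze", "de:hund"]
doc_b = ["de:auto"]
print(score_by_addition(query, doc_a), score_by_addition(query, doc_b))
print(score_by_translation(query, doc_a, "de:"), score_by_translation(query, doc_b, "de:"))
```

In a real system the translated terms would typically feed a standard ranking function rather than the simple overlap ratio used here; the overlap is only a stand-in to keep the sketch short.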
Li, Y., & Zhou, D. (2019). Research on Cross-Language Retrieval Using Bilingual Word Vectors in Different Languages. In Communications in Computer and Information Science (Vol. 1058, pp. 454–465). Springer Verlag. https://doi.org/10.1007/978-981-15-0118-0_35