Research on Cross-Language Retrieval Using Bilingual Word Vectors in Different Languages

Yulong Li; Dong Zhou

Conference Proceedings

Communications in Computer and Information Science (2019) 1058 454-465

DOI: 10.1007/978-981-15-0118-0_35

Abstract

Bilingual word vectors have been widely exploited in cross-language information retrieval research. However, most current research focuses on similar language pairs, and very few studies explore the impact of using bilingual word vectors for cross-language information retrieval in long-distance language pairs. This paper systematically analyzes the retrieval performance of various European languages (English, German, Italian, French, Finnish, Dutch) as well as Asian languages (Chinese, Japanese) in the ad hoc task of the CLEF 2002–2003 campaigns. Genetic proximity was used to visually represent the relationships between languages and to compare their cross-lingual retrieval performance in various settings. The results show that differences in language vocabulary dramatically affect retrieval performance. At the same time, the term-by-term translation retrieval method performs slightly better than the simple vector addition retrieval methods. This indicates that the translation-based retrieval model can still maintain its advantage under the new semantic scheme.
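
The two retrieval strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify their models, so the toy English and German vectors (`EN`, `DE`), the sample documents (`DOCS`), and all function names below are hypothetical. The sketch assumes both languages already share one embedding space, uses cosine similarity for the vector addition method, and uses nearest-neighbor translation followed by simple term counting for the term-by-term method.

```python
import math

# Hypothetical toy vectors in a shared bilingual space (illustration only).
EN = {"cat": [0.9, 0.1, 0.0], "bank": [0.0, 0.2, 0.9]}
DE = {
    "katze": [0.88, 0.12, 0.05],
    "hund": [0.7, 0.6, 0.0],
    "bank": [0.05, 0.15, 0.92],
    "geld": [0.1, 0.3, 0.8],
}
# Hypothetical target-language documents, already tokenized.
DOCS = {"d1": ["katze", "hund"], "d2": ["bank", "geld"]}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def add_vectors(words, table):
    """Sum the vectors of all words found in the table."""
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for word in words:
        if word in table:
            total = [t + x for t, x in zip(total, table[word])]
    return total


def rank_by_addition(query, docs):
    """Vector addition: compare the summed query vector to summed document vectors."""
    q = add_vectors(query, EN)
    scores = {doc_id: cosine(q, add_vectors(words, DE)) for doc_id, words in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def translate_term(word):
    """Translate one source term to its nearest target-language neighbor."""
    if word not in EN:
        return None
    return max(DE, key=lambda target: cosine(EN[word], DE[target]))


def rank_by_translation(query, docs):
    """Term-by-term translation: translate each query term, then count matches."""
    translated = [t for t in (translate_term(w) for w in query) if t is not None]
    scores = {doc_id: sum(words.count(t) for t in translated) for doc_id, words in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Calling `rank_by_addition(["cat"], DOCS)` and `rank_by_translation(["cat"], DOCS)` ranks the same toy collection under each strategy. A real system would replace the count-based scoring with a proper retrieval model and the toy tables with trained bilingual embeddings.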

Citation (APA)

Li, Y., & Zhou, D. (2019). Research on Cross-Language Retrieval Using Bilingual Word Vectors in Different Languages. In Communications in Computer and Information Science (Vol. 1058, pp. 454–465). Springer Verlag. https://doi.org/10.1007/978-981-15-0118-0_35
