Mixed language Arabic-English information retrieval


Abstract

For many non-English languages in developing countries, such as Arabic, text switching/mixing (e.g. between Arabic and English) is very prevalent, especially in scientific domains, because most technical terms are borrowed from English and either are not included in the native (non-English) languages or have no precise translation or transliteration in them. This makes it difficult to search only in the native language, because either non-English-speaking users, such as Arabic speakers, are unable to express terminology in their native languages or the concepts need to be expanded using context. The result is mixed queries and documents in the non-English-speaking world (the Arabic world in particular). Mixed-language querying is a challenging problem that has not attracted major attention in the IR community. Current search engines and traditional cross-language IR (CLIR) systems do not handle mixed-language querying adequately and do not exploit this natural human tendency. This paper attempts to address the problem of mixed querying in CLIR. It proposes a mixed-language (language-aware) IR solution, in the form of a cross-lingual re-weighting model, in which mixed queries are used to retrieve the most relevant documents regardless of their languages. For the experiments, a new multilingual and mixed Arabic-English corpus in the computer science domain was created. Test results showed that the proposed cross-lingual re-weighting model yielded statistically significantly better results for mixed-language queries and achieved more than 94% of monolingual baseline effectiveness.
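To make the notion of a mixed query concrete, the following is a minimal sketch that separates the Arabic-script and Latin-script terms of a single query using Unicode character names. It is only an illustration of what a mixed query looks like to a system, not the cross-lingual re-weighting model the paper proposes; the function names `script_of` and `split_mixed_query` and the sample query are assumptions introduced here.

```python
import unicodedata


def script_of(token: str) -> str:
    """Classify a token by the script of its first alphabetic character.

    Hypothetical helper for illustration; returns 'arabic', 'latin', or 'other'.
    """
    for ch in token:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith("ARABIC"):
                return "arabic"
            if name.startswith("LATIN"):
                return "latin"
            return "other"
    # No alphabetic character found (e.g. digits or punctuation only).
    return "other"


def split_mixed_query(query: str) -> dict:
    """Group the whitespace-separated terms of a query by script."""
    parts = {"arabic": [], "latin": [], "other": []}
    for token in query.split():
        parts[script_of(token)].append(token)
    return parts


# Assumed sample query: Arabic words around English technical terms.
parts = split_mixed_query("مفهوم polymorphism في Java")
print(parts["arabic"])
print(parts["latin"])
```

A language-aware system would presumably need some step of this kind before terms in each language can be weighted, but how the paper's model weights them is not described in the abstract and is not reproduced here.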

Cite

Mustafa, M., & Suleman, H. (2015). Mixed language Arabic-English information retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9042, pp. 427–447). Springer Verlag. https://doi.org/10.1007/978-3-319-18117-2_32
