Query translation and expansion for searching normal and OCR-Degraded Arabic text

Tarek Elghazaly; Aly Fahmy

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009), vol. 5449 LNCS, pp. 481–497

DOI: 10.1007/978-3-642-00382-0_39

Abstract

This paper presents a novel model for English/Arabic query translation to search Arabic text, and then expands the Arabic query to handle OCR-degraded Arabic text. The model covers detecting and translating word collocations, translating single words, transliterating names, and disambiguating translations and transliterations through different approaches. It also expands the query with the expected OCR errors generated by the Arabic OCR-error simulation model proposed in the paper. The query translation and expansion model is supported by several resources proposed in the paper, such as a word collocations dictionary, single-word dictionaries, a Modern Arabic corpus, and other tools. The model achieves high accuracy in translating queries from English to Arabic, resolving the translation and transliteration ambiguities, and, with orthographic query expansion, it handles OCR errors with a high degree of accuracy. © Springer-Verlag Berlin Heidelberg 2009.
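The orthographic query expansion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation model: the confusion table, the function names `expand_term` and `expand_query`, and the single-substitution limit are all assumptions made for illustration. The idea shown is only that each query term is expanded into variants an OCR engine might plausibly have produced, so that degraded text can still match.

```python
# Hypothetical confusion table: each Arabic letter maps to letters an OCR
# engine might output in its place. The pairs are illustrative (letters
# that differ mainly in their dots) and are NOT taken from the paper.
CONFUSIONS = {
    "ب": ["ت", "ث"],
    "ت": ["ب", "ث"],
    "ج": ["ح", "خ"],
    "ر": ["ز"],
    "د": ["ذ"],
}


def expand_term(term, confusions=CONFUSIONS, max_variants=50):
    """Return the term followed by variants that differ in one character.

    Only single-character substitutions are generated; a fuller error
    model would also cover insertions, deletions, and multi-character
    confusions.
    """
    variants = [term]
    for i, ch in enumerate(term):
        for alt in confusions.get(ch, []):
            candidate = term[:i] + alt + term[i + 1:]
            if candidate not in variants:
                variants.append(candidate)
            # Cap the list so long terms do not blow up the query.
            if len(variants) >= max_variants:
                return variants
    return variants


def expand_query(terms):
    """Map each query term to its variants, to be OR-ed at search time."""
    return {t: expand_term(t) for t in terms}
```

Under these assumptions, calling `expand_query` on the translated Arabic terms yields, for each term, the original spelling plus its single-substitution variants; a search engine could then match any of them against the OCR output.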

Cite

Elghazaly, T., & Fahmy, A. (2009). Query translation and expansion for searching normal and OCR-Degraded Arabic text. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5449 LNCS, pp. 481–497). https://doi.org/10.1007/978-3-642-00382-0_39
