An enhanced Arabic OCR degraded text retrieval model

Mostafa Ezzat; Tarek ElGhazaly; Mervat Gheith

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 8265 LNAI(PART 1) 380-393

DOI: 10.1007/978-3-642-45114-0_31


Abstract

This paper presents a new model that improves the retrieval effectiveness of Arabic OCR-degraded text. The proposed model is based on simulating Arabic OCR recognition mistakes using a word-based approach. The model then expands the user's search query with the expected OCR errors. The resulting expanded query gives higher precision and recall when searching Arabic OCR-degraded text than the original query. The new model shows a significant increase in degraded-text retrieval effectiveness over previous models: its retrieval effectiveness is 97%, while the best published effectiveness for a word-based approach was 84% and the best for a character-based approach was 56%. In addition, the new model overcomes several limitations of the two existing models. © Springer-Verlag 2013.
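
The query-expansion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how OCR errors are simulated, so the sketch assumes a precomputed word-level table of expected degraded forms. The names `OCR_VARIANTS`, `expand_query`, and `to_boolean_query` are hypothetical, and the table entries are transliterated placeholders rather than real OCR data.

```python
from typing import Dict, List

# Hypothetical word-level error table: clean word -> forms an OCR engine
# might produce for it. Entries are illustrative placeholders, not real data.
OCR_VARIANTS: Dict[str, List[str]] = {
    "kitab": ["kinab", "kitat"],
    "madrasa": ["nadrasa"],
}


def expand_query(
    query: str, variants: Dict[str, List[str]] = OCR_VARIANTS
) -> List[List[str]]:
    """Return one group per query term: the term plus its expected degraded forms."""
    groups: List[List[str]] = []
    for term in query.split():
        group = [term]
        for variant in variants.get(term, []):
            if variant not in group:  # avoid duplicate alternatives
                group.append(variant)
        groups.append(group)
    return groups


def to_boolean_query(groups: List[List[str]]) -> str:
    """Render the groups as a Boolean query: OR within a term, AND across terms."""
    return " AND ".join("(" + " OR ".join(group) + ")" for group in groups)


if __name__ == "__main__":
    print(to_boolean_query(expand_query("kitab jadid")))
```

A search engine given the expanded query can then match documents that contain either the clean word or one of its expected degraded forms, which is the mechanism the abstract credits for the gain in precision and recall.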

Cite

Ezzat, M., ElGhazaly, T., & Gheith, M. (2013). An enhanced Arabic OCR degraded text retrieval model. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8265 LNAI, pp. 380–393). https://doi.org/10.1007/978-3-642-45114-0_31
