Keyword searching for Arabic handwritten documents

  • Saabni R
  • El-Sana J

Abstract

In this paper we present a system for searching keywords in Arabic handwritten and historical documents using two algorithms, Dynamic Time Warping (DTW) and Hidden Markov Models (HMM). The HMM-based system provides satisfactory results when adequate training samples are available, which is not always the case for historical documents. The DTW algorithm, with a slight modification, provides better results even with a small set of training samples. The observation sequences for the matching algorithms are generated by extracting a set of geometric features that have already been shown to obtain good recognition rates for on-line Arabic handwriting. We have adopted the segmentation-free approach, i.e., continuous word-parts are used as the basic alphabet instead of the usual alphabet letters. The contours of the complete word-parts are used to represent the shapes of the compared word-parts. Additional strokes, such as dots and detached short segments, which are very common in Arabic scripts, are used via a rule-based system to improve the search algorithm and determine the final comparison decision. The search for a keyword is performed by searching for its word-parts, including the additional strokes, in the right order. The results for our modified DTW algorithm are very encouraging, even when using a small set of samples for training.
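To make the matching step concrete, the following is a minimal sketch of textbook DTW between two sequences of feature vectors. It is not the authors' modified DTW, whose details the abstract does not give. The names `dtw_distance` and `euclidean`, the Euclidean local cost, and the toy one-dimensional features are assumptions chosen for illustration only.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Local cost between two feature vectors (an assumed choice of metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a: Sequence[Vector], seq_b: Sequence[Vector]) -> float:
    """Classic DTW: minimum cumulative cost of aligning seq_a with seq_b.

    Standard formulation only; the paper's modification is not reproduced.
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] = best cumulative cost aligning the first i items of seq_a
    # with the first j items of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in seq_a only
                cost[i][j - 1],      # advance in seq_b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Hypothetical 1-D "features" for a query word-part and two candidates.
query = [(0.0,), (1.0,), (2.0,)]
stretched = [(0.0,), (1.0,), (1.0,), (2.0,)]  # same shape, written more slowly
different = [(2.0,), (0.0,), (2.0,)]

print(dtw_distance(query, stretched))
print(dtw_distance(query, different))
```

Because DTW allows one element to align with several elements of the other sequence, the stretched candidate matches the query at zero cost, while the differently shaped candidate does not. In a keyword search of the kind described, a candidate word-part would presumably be accepted when such a distance falls below some threshold; that threshold is not specified in the abstract.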

Cite

Saabni, R., & El-Sana, J. (2008). Keyword searching for Arabic handwritten documents. The, 1–14. Retrieved from http://www.cs.bgu.ac.il/~saabni/Docs/KeyWordSearch.pdf
