HMM based keyword spotting system in printed/handwritten Arabic/Latin documents with identification stage

Abstract

In this paper, we propose a novel script-independent approach for word spotting in printed and handwritten multi-script documents. Since each writing type and script needs to be processed using a specific spotting engine, the proposed system proceeds in two stages. First, a one-step identification method determines the writing type and the script of the input document image. The identification system is based on hidden Markov models (HMMs) and does not need any additional resources (training, preprocessing, feature extraction) besides those used in the spotting step. Second, a specific word spotting method is used to detect any given keyword in document images. The proposed spotting system is lexicon-free, i.e., it can spot arbitrary keywords that need not be known at the training stage. The global system has been evaluated on a mixed corpus of public databases such as KHATT and KAFD for Arabic script and ALTID and RIMES for Latin script. The experimental results on document-level writing type and script identification and on keyword spotting confirm the effectiveness of the proposed scheme.
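
The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes toy discrete HMMs, one per (writing type, script) class, and picks the class whose model gives the highest forward-algorithm log-likelihood before handing the input to a class-specific spotting engine. The names `MODELS`, `ENGINES`, `identify`, and `spot`, the quantized observation symbols, and all probability values are hypothetical and chosen only for illustration.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in the log domain for a discrete HMM.

    Assumes every probability in start, trans, and emit is strictly positive.
    """
    n = len(start)
    alpha = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + math.log(trans[p][s]) for p in range(n)])
            + math.log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


# Hypothetical toy models: (start, transition, emission) per class.
TRANS = [[0.9, 0.1], [0.1, 0.9]]
MODELS = {
    ("printed", "latin"): ([0.5, 0.5], TRANS, [[0.9, 0.1], [0.8, 0.2]]),
    ("handwritten", "arabic"): ([0.5, 0.5], TRANS, [[0.1, 0.9], [0.2, 0.8]]),
}


def identify(obs, models):
    """Stage 1: return the class whose HMM scores the observations highest."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# Hypothetical class-specific spotting engines (placeholders only).
ENGINES = {
    label: (lambda obs, keyword, label=label: f"{label[0]}/{label[1]} engine: {keyword}")
    for label in MODELS
}


def spot(obs, keyword):
    """Stage 2: dispatch to the engine selected by the identification stage."""
    label = identify(obs, MODELS)
    return ENGINES[label](obs, keyword)


if __name__ == "__main__":
    print(spot([0, 0, 0, 1, 0], "keyword"))
```

In this sketch the identification stage reuses the same models and observation sequence that a spotting engine would consume, which mirrors the abstract's point that identification needs no extra training, preprocessing, or feature extraction. How the paper's system actually scores and selects classes is not specified in the abstract.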

Cite

Cheikh Rouhou, A., Kessentini, Y., & Kanoun, S. (2019). HMM based keyword spotting system in printed/handwritten arabic/latin documents with identification stage. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11662 LNCS, pp. 309–320). Springer Verlag. https://doi.org/10.1007/978-3-030-27202-9_28