Spotting of keyword directly in run-length compressed documents

Abstract

With the rapid growth of digital libraries, e-governance, and Internet applications, a huge volume of documents is being generated, communicated, and archived in compressed form for better storage and transfer efficiency. In such large repositories of compressed documents, frequently used operations such as keyword searching and document retrieval currently have to be carried out after decompression, and then with the help of OCR. Developing a keyword spotting technique that works directly on compressed documents is therefore a promising and challenging research problem. Against this backdrop, the paper presents a novel approach for searching keywords directly in run-length compressed documents without going through the stages of decompression and OCR. The proposed method extracts simple, font-size-invariant features, such as the number of run transitions and the correlation of runs over selected regions of the test words, and matches them against those of the user's query word. In the subsequent step, keywords are spotted in the compressed document based on the matching score. The major contribution of the paper is the idea of decompression-free and OCR-free word spotting directly in compressed documents. The method is evaluated on a data set of compressed documents, and the preliminary results validate the proposed idea.
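
To make the feature idea concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not define the features precisely, so everything here is an assumption: a word region is represented as rows of alternating white/black run lengths, the feature is the per-row count of run transitions resampled to a fixed number of bins (so that a larger font gives the same profile), and the matching score is a plain Pearson correlation between two profiles. All function names are hypothetical.

```python
from math import sqrt
from typing import List


def run_length_encode(row: List[int]) -> List[int]:
    """Encode a binary row (0 = white, 1 = black) as alternating run
    lengths, always starting with a white run (possibly of length 0).
    Used here only to build test data; a compressed-domain method would
    read the runs from the compressed file instead."""
    runs, current, count = [], 0, 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current, count = pixel, 1
    runs.append(count)
    return runs


def transition_count(runs: List[int]) -> int:
    """Number of white/black changes in a row, computed from runs alone."""
    return max(sum(1 for r in runs if r > 0) - 1, 0)


def transition_profile(rows_of_runs: List[List[int]], n_bins: int = 4) -> List[int]:
    """Per-row transition counts, resampled to n_bins rows (assumed
    normalization) so that font height does not change the profile length."""
    profile = [transition_count(runs) for runs in rows_of_runs]
    return [profile[i * len(profile) // n_bins] for i in range(n_bins)]


def matching_score(a: List[int], b: List[int]) -> float:
    """Pearson correlation of two equal-length profiles (assumed score);
    returns 0.0 if either profile is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def upscale(image: List[List[int]], k: int) -> List[List[int]]:
    """Enlarge a binary image by an integer factor k (simulates a larger font)."""
    return [[p for p in row for _ in range(k)] for row in image for _ in range(k)]


# Toy "query word" region and the same shape at twice the font size.
query = [
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
query_runs = [run_length_encode(r) for r in query]
large_runs = [run_length_encode(r) for r in upscale(query, 2)]

query_profile = transition_profile(query_runs)
large_profile = transition_profile(large_runs)
score = matching_score(query_profile, large_profile)
```

In this toy case the doubled-size region yields the same resampled profile as the query, which is the sense in which the sketch is font-size invariant. Real documents would also need word segmentation in the compressed domain and a threshold on the score, neither of which is specified in the abstract.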

Cite

Javed, M., Nagabhushan, P., & Chaudhuri, B. B. (2017). Spotting of keyword directly in run-length compressed documents. In Advances in Intelligent Systems and Computing (Vol. 459 AISC, pp. 367–376). Springer Verlag. https://doi.org/10.1007/978-981-10-2104-6_33
