Numerous approaches to detecting duplicate documents, including textual, structural and featural ones, have been investigated. Because document images are usually stored and transmitted in compressed form, it is advantageous to perform document matching directly on the compressed data. A two-stage process for matching Group 4 compressed document images is presented. In the coarse matching stage, ranked hypotheses are generated based on compression bit profile correlations. These candidates are further evaluated using a feature set similar to the pass codes. Multiple descriptors based on the local arrangement of the feature points are constructed for efficient indexing into the database. The performance of the algorithm on the UW database is discussed.
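The coarse matching stage can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a bit profile is a one-dimensional sequence holding the number of compressed bits spent on each scanline, that profiles are compared by normalized correlation, and that a small vertical offset between two scans is tolerated by trying a few shifts. The function names, the `max_shift` and `top_k` parameters, and the dictionary-based database are all hypothetical.

```python
from math import sqrt


def normalized_correlation(a, b):
    """Pearson-style correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        # A constant profile carries no shape information.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def best_shifted_correlation(query, candidate, max_shift=2):
    """Highest correlation over small relative offsets (assumed tolerance)."""
    best = -1.0
    for shift in range(-max_shift, max_shift + 1):
        if shift >= 0:
            a, b = query[shift:], candidate
        else:
            a, b = query, candidate[-shift:]
        n = min(len(a), len(b))
        if n < 2:
            continue  # too little overlap to correlate
        best = max(best, normalized_correlation(a[:n], b[:n]))
    return best


def rank_candidates(query, database, top_k=3):
    """Return the top_k (name, score) hypotheses, best score first.

    `database` is a hypothetical dict mapping a document name to its
    bit profile.
    """
    scored = [
        (best_shifted_correlation(query, profile), name)
        for name, profile in database.items()
    ]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_k]]
```

Calling `rank_candidates(query_profile, database)` returns a short ranked list of hypotheses. In a two-stage scheme of the kind described above, only these candidates would be passed on to the more expensive feature-based evaluation.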
CITATION STYLE
Lee, D. S., & Hull, J. J. (1999). Group 4 compressed document matching. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1655, pp. 13–21). Springer Verlag. https://doi.org/10.1007/3-540-48172-9_2