Group 4 compressed document matching

Dar Shyang Lee; Jonathan J. Hull

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (1999) 1655 13-21

DOI: 10.1007/3-540-48172-9_2

Abstract

Numerous approaches to detecting duplicate documents, including textual, structural and featural ones, have been investigated. Because document images are usually stored and transmitted in compressed form, it is advantageous to perform document matching directly on the compressed data. A two-stage process for matching Group 4 compressed document images is presented. In the coarse matching stage, ranked hypotheses are generated based on compression bit profile correlations. These candidates are further evaluated using a feature set similar to the pass codes. Multiple descriptors based on the local arrangement of the feature points are constructed for efficient indexing into the database. Performance of the algorithm on the UW database is discussed.
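
The coarse matching stage can be illustrated with a minimal sketch. The abstract does not define the bit profile or the correlation measure, so the code below makes assumptions: a bit profile is taken to be a list of compressed bit counts per scanline, and similarity is plain Pearson correlation over the overlapping length. The function names `pearson` and `rank_candidates` are hypothetical, and alignment or scaling between profiles is not handled.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    if n == 0:
        return 0.0
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A flat profile carries no shape information to correlate.
        return 0.0
    return cov / sqrt(var_a * var_b)


def rank_candidates(query_profile, database, top_k=3):
    """Return up to top_k (doc_id, score) pairs, best correlation first.

    Assumption: each profile is a list of bit counts per scanline.
    Profiles are truncated to their common length before comparison.
    """
    scored = []
    for doc_id, profile in database.items():
        n = min(len(query_profile), len(profile))
        score = pearson(query_profile[:n], profile[:n])
        scored.append((score, doc_id))
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_k]]


# Toy profiles, invented for illustration only.
database = {
    "doc_a": [10, 40, 35, 5],
    "doc_b": [5, 5, 50, 50],
}
query = [12, 42, 33, 6]
ranked = rank_candidates(query, database)
```

In this toy setup the ranked list would be handed to a second stage for finer verification; the feature-based evaluation described in the abstract is not sketched here.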

Cite

Lee, D. S., & Hull, J. J. (1999). Group 4 compressed document matching. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1655, pp. 13–21). Springer Verlag. https://doi.org/10.1007/3-540-48172-9_2
