Compression of concordances in full-text retrieval systems

Abstract

The concordance of a full-text information retrieval system contains, for every distinct word W of the database, a list L(W) of "coordinates", each of which describes the exact location of an occurrence of W in the text. The concordance should be compressed, not only to save storage space but also to reduce the number of I/O operations, since the file is usually kept in secondary memory. Several methods are presented that efficiently compress the concordances of large full-text retrieval systems. The methods were tested on the concordance of the Responsa Retrieval Project and yield savings of up to 49% relative to the uncompressed file; this is a relative improvement of about 27% over the currently used prefix-omission compression technique.
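
To make the terms concrete, the following is a minimal sketch of a concordance and of a prefix-omission style encoding. It is an illustration under stated assumptions, not the scheme used in the Responsa Retrieval Project: the three-field coordinate (document, sentence, word position), the toy documents, and the function names `build_concordance`, `prefix_omit`, and `prefix_restore` are all hypothetical. The idea shown is that consecutive coordinates in a sorted list L(W) often share their leading fields, so each coordinate can be stored as a count of shared leading fields plus the fields that differ.

```python
from collections import defaultdict

# Assumption: a coordinate is (document, sentence, word position).
# Real systems may use a different hierarchy of fields.

def build_concordance(documents):
    """Map each distinct word W to its sorted list L(W) of coordinates."""
    concordance = defaultdict(list)
    for doc_id, doc in enumerate(documents):
        for sent_id, sentence in enumerate(doc):
            for word_id, word in enumerate(sentence.split()):
                concordance[word.lower()].append((doc_id, sent_id, word_id))
    return dict(concordance)

def prefix_omit(coords):
    """Encode each coordinate as (shared, rest): the number of leading
    fields it shares with the previous coordinate, and the other fields."""
    encoded = []
    previous = ()
    for coord in coords:
        shared = 0
        while (shared < len(previous) and shared < len(coord)
               and previous[shared] == coord[shared]):
            shared += 1
        encoded.append((shared, coord[shared:]))
        previous = coord
    return encoded

def prefix_restore(encoded):
    """Invert prefix_omit by copying the shared prefix from the previous coordinate."""
    coords = []
    previous = ()
    for shared, rest in encoded:
        coord = previous[:shared] + tuple(rest)
        coords.append(coord)
        previous = coord
    return coords

# Hypothetical toy database: two documents, each a list of sentences.
documents = [["the cat sat", "the dog sat"], ["a cat ran"]]
concordance = build_concordance(documents)
encoded_the = prefix_omit(concordance["the"])
```

In this toy example the second occurrence of "the" shares its document number with the first, so only its sentence and word position are stored. The savings in a real concordance depend on how the omitted-field count itself is encoded, which this sketch does not model.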

Cite

Choueka, Y., Fraenkel, A. S., & Klein, S. T. (1988). Compression of concordances in full-text retrieval systems. In Proceedings of the 11th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 1988 (pp. 597–612). Association for Computing Machinery, Inc. https://doi.org/10.1145/62437.62500