Optimized binary search and text retrieval

Eduardo Fernandes Barbosa; Gonzalo Navarro; Ricardo Baeza-Yates; Chris Perleberg; Nivio Ziviani

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 979, pp. 311-326 (1995)

DOI: 10.1007/3-540-60313-1_152

Abstract

We present an algorithm that minimizes the expected cost of indirect binary search for data with non-constant access costs, such as disk data. Indirect binary search means that sorted access to the data is obtained through an array of pointers to the raw data. One immediate application of this algorithm is to improve the retrieval performance of disk databases that are indexed using the suffix array model (also called PAT array). We consider the cost model of magnetic and optical disks and the anticipated knowledge of the expected size of the subproblem produced by reading each disk track. This information is used to devise a modified binary searching algorithm to decrease overall retrieval costs. Both an optimal and a practical algorithm are presented, together with analytical and experimental results. For 100 megabytes of text the practical algorithm costs 60% of the standard binary search cost for the magnetic disk and 65% for the optical disk.
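To make the idea of indirect binary search with non-constant access costs concrete, here is a minimal Python sketch. It is not the optimal or the practical algorithm from the paper. It only illustrates that any probe position inside the current interval keeps the search correct, so the probe can be chosen with access cost in mind. The cost model (`seek_cost`), the track size, and the pivot heuristic (`cheap_pivot`) are assumptions made up for this illustration.

```python
from typing import Callable, List, Tuple


def build_suffix_array(text: str) -> List[int]:
    """Naive suffix array: pointers into text, sorted by the suffix they start."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def seek_cost(head: int, pos: int, track_size: int) -> int:
    """Hypothetical cost model: 0 if pos lies on the track under the head,
    otherwise 1 plus the number of tracks the head has to travel."""
    a, b = head // track_size, pos // track_size
    return 0 if a == b else 1 + abs(a - b)


def middle_pivot(sa: List[int], lo: int, hi: int, head: int, track_size: int) -> int:
    """Standard binary search: always probe the middle of [lo, hi)."""
    return (lo + hi) // 2


def cheap_pivot(sa: List[int], lo: int, hi: int, head: int, track_size: int) -> int:
    """Illustrative heuristic (an assumption, not the paper's method):
    among the central half of [lo, hi), probe the cheapest entry,
    breaking ties in favour of the most central one."""
    mid = (lo + hi) // 2
    quarter = (hi - lo) // 4
    candidates = range(mid - quarter, mid + quarter + 1)
    return min(candidates,
               key=lambda i: (seek_cost(head, sa[i], track_size), abs(i - mid)))


def indirect_search(text: str, sa: List[int], pattern: str,
                    choose_pivot: Callable[[List[int], int, int, int, int], int],
                    track_size: int = 4) -> Tuple[int, int]:
    """Lower-bound search through the pointer array sa.
    Returns (index of the first suffix >= pattern, total simulated cost)."""
    lo, hi = 0, len(sa)
    head, total = 0, 0
    while lo < hi:
        mid = choose_pivot(sa, lo, hi, head, track_size)
        pos = sa[mid]                      # indirect access: pointer into the raw text
        total += seek_cost(head, pos, track_size)
        head = pos
        if text[pos:pos + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo, total


text = "abracadabra"
sa = build_suffix_array(text)
idx_std, cost_std = indirect_search(text, sa, "bra", middle_pivot)
idx_alt, cost_alt = indirect_search(text, sa, "bra", cheap_pivot)
print(idx_std, cost_std, idx_alt, cost_alt)
```

Both pivot rules return the same index because the comparison is monotone over the sorted pointer array. The heuristic is not guaranteed to lower the total cost on every query; minimizing the expected cost is what the algorithm described in the abstract addresses.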

Citation (APA)

Barbosa, E. F., Navarro, G., Baeza-Yates, R., Perleberg, C., & Ziviani, N. (1995). Optimized binary search and text retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 979, pp. 311–326). Springer Verlag. https://doi.org/10.1007/3-540-60313-1_152
