Harvesting for full-text retrieval

Fabio Simeoni; Murat Yakici; Steve Neely; Fabio Crestani

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2005), Vol. 3815 LNCS, pp. 204–213

DOI: 10.1007/11599517_24

Abstract

We propose an approach to Distributed Information Retrieval based on the periodic and incremental centralisation of full-text indices of widely dispersed and autonomously managed content sources. Inspired by the success of the Open Archives Initiative's protocol for metadata harvesting, the approach occupies the middle ground between (i) the crawling of content and (ii) the distribution of retrieval. As in crawling, some data moves towards the retrieval process, but it is statistics about the content rather than the content itself. As in distributed retrieval, some processing is distributed along with the data, but it is indexing rather than retrieval itself. We show that the approach retains the good properties of centralised retrieval without giving up cost-effective resource pooling. We discuss the requirements associated with the approach and identify two strategies to deploy it on top of the OAI infrastructure. © Springer-Verlag Berlin Heidelberg 2005.
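The division of labour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `Source` and `CentralIndex`, the integer logical timestamps, the whitespace tokenizer, and the simple tf-idf scoring are all assumptions made for illustration. The `since` parameter only loosely mirrors the date-based selective harvesting of the OAI protocol. Each source indexes its own content and exports term statistics; the central service pulls only what changed since its last visit and answers queries locally.

```python
import math
from collections import Counter, defaultdict


class Source:
    """Hypothetical content source that indexes its own documents locally."""

    def __init__(self, name):
        self.name = name
        self.records = {}  # doc_id -> (stamp, term-frequency Counter)

    def add(self, doc_id, text, stamp):
        # Indexing happens at the source; only term statistics are kept for export.
        self.records[doc_id] = (stamp, Counter(text.lower().split()))

    def harvest(self, since):
        # Incremental export: only records added or changed after `since`.
        return {d: tf for d, (stamp, tf) in self.records.items() if stamp > since}


class CentralIndex:
    """Hypothetical central service that harvests statistics, not content."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {(source, doc_id): tf}
        self.last_harvest = {}             # source name -> stamp of last pull

    def pull(self, source, now):
        since = self.last_harvest.get(source.name, -1)
        changed = source.harvest(since)
        for doc_id, tf in changed.items():
            key = (source.name, doc_id)
            # Drop stale postings for a re-indexed document, then add the new ones.
            for docs in self.postings.values():
                docs.pop(key, None)
            for term, freq in tf.items():
                self.postings[term][key] = freq
        self.last_harvest[source.name] = now
        return len(changed)

    def search(self, query):
        # Retrieval stays centralised: a plain tf-idf ranking over harvested statistics.
        n_docs = len({k for docs in self.postings.values() for k in docs})
        scores = Counter()
        for term in query.lower().split():
            docs = self.postings.get(term, {})
            if not docs:
                continue
            idf = math.log(1 + n_docs / len(docs))
            for key, freq in docs.items():
                scores[key] += freq * idf
        return [key for key, _ in scores.most_common()]
```

In this sketch a second `pull` on an unchanged source transfers nothing, which is the incremental property the abstract refers to; the document text itself never leaves the source.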

Cite

Simeoni, F., Yakici, M., Neely, S., & Crestani, F. (2005). Harvesting for full-text retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3815 LNCS, pp. 204–213). https://doi.org/10.1007/11599517_24
