Improving web pages retrieval using combined fields

Abstract

This article describes the participation of the REINA Research Group of the University of Salamanca in WebCLEF 2006. This year we participated in the Monolingual Mixed Task in Spanish. The entire EuroGOV collection was processed to select all the pages in Spanish. All the pages in the .es domain were also pre-selected. Our objective this year was to test pre-retrieval techniques for combining information fields or elements from web pages, as well as the retrieval capability of each of these fields. In vector-based retrieval systems, terms coming from different sources can be combined by operating on the frequency of the terms in the document, using a tf × idf weighting scheme. The BODY field is, of course, the most useful from the retrieval perspective, but the text of the backlinks brings considerable improvement. META fields or tags, however, contribute little to retrieval improvement. © Springer-Verlag Berlin Heidelberg 2007.
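
As an illustration of the idea of combining fields by operating on term frequency, here is a minimal sketch, not the authors' implementation. The field names (`body`, `anchor`, `meta`), the field weights, the whitespace tokenizer, and the function names are all assumptions made for the example; the abstract does not state the values or the exact formula that were used.

```python
import math
from collections import Counter

# Hypothetical field weights (assumption): the abstract gives no actual values.
FIELD_WEIGHTS = {"body": 1.0, "anchor": 2.0, "meta": 0.5}


def tokenize(text):
    """Very simple tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def combined_tf(page, weights=FIELD_WEIGHTS):
    """Merge the fields of one page into a single term-frequency map.

    Each occurrence of a term adds the weight of the field it came from,
    so a term in a heavily weighted field counts as several occurrences.
    """
    tf = Counter()
    for field, text in page.items():
        w = weights.get(field, 0.0)  # unknown fields contribute nothing
        for term in tokenize(text):
            tf[term] += w
    return tf


def tfidf_vectors(pages, weights=FIELD_WEIGHTS):
    """Build one tf x idf vector per page from the combined frequencies."""
    tfs = [combined_tf(p, weights) for p in pages]
    n = len(tfs)
    df = Counter()
    for tf in tfs:
        # Document frequency: count each page once per term it contains.
        df.update(t for t, f in tf.items() if f > 0)
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: f * idf[t] for t, f in tf.items() if f > 0} for tf in tfs]


# Toy pages (invented for the example): "anchor" stands for backlink text.
pages = [
    {"body": "ministerio de salud", "anchor": "salud"},
    {"body": "ministerio de hacienda"},
]
vectors = tfidf_vectors(pages)
```

In this toy collection, "salud" in the first page receives a combined frequency of 3.0 (1.0 from the body plus 2.0 from the backlink text), while a term that appears in every page, such as "ministerio", gets an idf of zero and so does not help to discriminate between pages.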

Cite

Figuerola, C. G., Berrocal, J. L. A., Rodríguez, Á. F. Z., & Rodríguez, E. (2007). Improving web pages retrieval using combined fields. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4730 LNCS, pp. 820–825). Springer Verlag. https://doi.org/10.1007/978-3-540-74999-8_102
