Reducing redundant information in search results employing approximation algorithms

Abstract

It is widely accepted that many Web documents contain identical or near-identical information. Modern search engines employ duplicate-detection algorithms to mitigate this problem in search results, but difficulties remain, mainly because the structure and content of the results cannot be changed. In this work we propose an effective methodology for removing redundant information from search results. Building on previous methodologies, we extract from the search results a set of composite documents called SuperTexts and then, by applying novel approximation algorithms, select the SuperTexts that best reduce the redundant information. The final results are then ranked according to their relevance to the initial query. We give some complexity results and experimentally evaluate the proposed algorithms. © 2014 Springer International Publishing Switzerland.
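The abstract does not detail the approximation algorithms used to select SuperTexts, but the selection step can be read as a coverage problem: pick few composite documents that together cover the distinct information units in the results. A minimal sketch of one plausible formulation, greedy weighted set cover, is shown below; all names (`supertexts`, information-unit fingerprints) are hypothetical illustrations, not the paper's actual method.

```python
def select_supertexts(supertexts):
    """Greedily pick SuperTexts until every information unit is covered.

    `supertexts` maps a SuperText id to the set of information units
    (e.g. sentence fingerprints) it contains. Greedy set cover gives a
    well-known ln(n)-approximation to the optimal cover size.
    """
    universe = set().union(*supertexts.values())
    covered, chosen = set(), []
    while covered != universe:
        # Pick the SuperText covering the most still-uncovered units.
        best = max(supertexts, key=lambda s: len(supertexts[s] - covered))
        gain = supertexts[best] - covered
        if not gain:
            break  # remaining units appear in no SuperText
        chosen.append(best)
        covered |= gain
    return chosen

# Toy example: three SuperTexts with overlapping information units.
docs = {
    "st1": {"a", "b", "c"},
    "st2": {"b", "c", "d"},
    "st3": {"d", "e"},
}
print(select_supertexts(docs))  # → ['st1', 'st3']
```

In this toy run, `st2` is skipped because its units are already covered by `st1` and `st3`, which is exactly the redundancy-reduction effect the paper targets; the surviving SuperTexts would then be re-ranked against the original query.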

Citation (APA)

Makris, C., Plegas, Y., Stamatiou, Y. C., Stavropoulos, E. C., & Tsakalidis, A. K. (2014). Reducing redundant information in search results employing approximation algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8645 LNCS, pp. 240–247). Springer Verlag. https://doi.org/10.1007/978-3-319-10085-2_22
