EXTIRP: Baseline retrieval from Wikipedia

Abstract

The Wikipedia XML documents are an interesting challenge for any XML retrieval system that can index and retrieve XML without prior knowledge of its structure. Although the structure of the Wikipedia XML documents is highly irregular and thus unpredictable, EXTIRP handles all well-formed XML documents without problems. Whether EXTIRP's high flexibility also implies high IR quality has so far been an open question. The initial results do not support a positive answer; instead, they tempt us to define some requirements for the XML documents that EXTIRP is expected to index. The most interesting question stemming from our results concerns the line between high-quality XML markup, which aids accurate IR, and noisy "XML spam", which misleads flexible XML search engines. © Springer-Verlag Berlin Heidelberg 2007.

Lehtonen, M., & Doucet, A. (2007). EXTIRP: Baseline retrieval from Wikipedia. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4518 LNCS, pp. 115–120). Springer Verlag. https://doi.org/10.1007/978-3-540-73888-6_12
