This article presents the general Wikipedia XML Collection developed for Structured Information Retrieval and Structured Machine Learning. This collection has been built from the Wikipedia encyclopedia. We detail in particular which parts of this collection were used during INEX 2006 for the Ad-hoc track and for the XML Mining track. Note that other INEX tracks, for example the multimedia track, have also been based on this collection. © Springer-Verlag Berlin Heidelberg 2007.
CITATION STYLE
Denoyer, L., & Gallinari, P. (2007). The Wikipedia XML corpus. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4518 LNCS, pp. 12–19). Springer Verlag. https://doi.org/10.1007/978-3-540-73888-6_2