This paper describes an approach that uses Google News as a source of information to generate a learning corpus for an information filtering task. The INFILE (INformation FILtering Evaluation) track of the CLEF (Cross-Language Evaluation Forum) 2009 campaign has been used as the framework. The information filtering task can be seen as a document classification task, so a supervised learning scheme has been followed. Two learning corpora have been tested: one using the text of the topics as learning data to train a classifier, and another where the training data have been generated from Google News pages, using the keywords of the topics as queries. Results show that using Google News to generate learning data does not improve on the results obtained using only the topic descriptions as the learning corpus. © 2010 Springer-Verlag Berlin Heidelberg.
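As an illustration of treating filtering as classification, the following minimal sketch trains one term-frequency profile per topic from its training texts and assigns an incoming document to the best-matching topic, or discards it when no topic is similar enough. The nearest-profile cosine classifier, the `threshold` value, and all function names and sample texts are assumptions made for this example; the abstract does not specify the classifier used. In the Google News variant, the list of training texts for each topic would be extended with pages retrieved for that topic's keywords, which this sketch does not fetch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(text):
    """Build a bag-of-words term-frequency vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(topics):
    """Merge each topic's training texts into a single profile vector.

    `topics` maps a topic id to a list of training texts (hypothetical layout).
    """
    return {
        topic_id: sum((vectorize(t) for t in texts), Counter())
        for topic_id, texts in topics.items()
    }


def filter_document(model, document, threshold=0.2):
    """Return the best-matching topic id, or None if the document is filtered out.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    vec = vectorize(document)
    best_id, best_score = None, 0.0
    for topic_id, profile in model.items():
        score = cosine(vec, profile)
        if score > best_score:
            best_id, best_score = topic_id, score
    return best_id if best_score >= threshold else None


if __name__ == "__main__":
    # Hypothetical topic descriptions standing in for INFILE topics.
    topics = {
        "t1": ["doping in professional cycling"],
        "t2": ["stock market crash and banking crisis"],
    }
    model = train(topics)
    print(filter_document(model, "A cycling team was suspended after a doping scandal"))
    print(filter_document(model, "Weather forecast for tomorrow"))
```

Comparing the two learning corpora then amounts to calling `train` once with topic descriptions only and once with the enlarged text lists, and evaluating both models on the same document stream.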
CITATION STYLE
Montejo-Ráez, A., Perea-Ortega, J. M., Díaz-Galiano, M. C., & Ureña-López, L. A. (2010). Experiments with Google News for filtering newswire articles. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6241 LNCS, pp. 381–384). https://doi.org/10.1007/978-3-642-15754-7_46