Autonomous news clustering and classification for an intelligent Web portal

Abstract

The paper presents an autonomous text classification module for a Romanian-language news web portal. It combines statistical natural language processing techniques so that the portal can operate fully autonomously. News items are collected automatically from a large number of news sources using web syndication, and machine-learning techniques then classify the news stream in two steps. First, the items are clustered using an agglomerative algorithm; the resulting groups correspond to the main news topics, so each group gathers coverage of a main topic from several news sources. Second, text classification algorithms automatically assign each cluster of news items to one of a predetermined set of classes. More than a thousand news items were used for both training and evaluating the classifiers. The paper presents a complete comparison of the results obtained for each method. © 2008 Springer-Verlag Berlin Heidelberg.
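
The two-step pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the linkage criterion, the similarity measure, or the classifiers that were compared, so average linkage, cosine similarity over whitespace-tokenized term counts, and a nearest-centroid labeller are all assumptions made here, as are the function names, the 0.3 threshold, and the English toy headlines.

```python
import math
from collections import Counter

def vectorize(text):
    # Assumed tokenization: lowercase, split on whitespace, count terms.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def agglomerate(texts, threshold=0.3):
    # Average-linkage agglomerative clustering (assumed variant):
    # start with one cluster per item and repeatedly merge the most
    # similar pair until no pair reaches the similarity threshold.
    vecs = [vectorize(t) for t in texts]
    clusters = [[i] for i in range(len(texts))]

    def linkage(c1, c2):
        total = sum(cosine(vecs[i], vecs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, (i, j) = max(
            (linkage(clusters[i], clusters[j]), (i, j))
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def train_centroids(labeled):
    # labeled: list of (text, label) pairs; returns label -> summed term counts.
    centroids = {}
    for text, label in labeled:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids

def label_cluster(cluster, texts, centroids):
    # Stand-in classifier: pick the class whose centroid is closest
    # to the merged term counts of all items in the cluster.
    merged = Counter()
    for i in cluster:
        merged.update(vectorize(texts[i]))
    return max(centroids, key=lambda lab: cosine(merged, centroids[lab]))

# Hypothetical toy data, not taken from the paper.
texts = [
    "government passes new budget law",
    "parliament debates budget law vote",
    "football team wins cup final",
    "cup final football match ends",
]
training = [
    ("budget law government vote", "politics"),
    ("football cup match team", "sport"),
]
clusters = agglomerate(texts)
centroids = train_centroids(training)
labels = [label_cluster(c, texts, centroids) for c in clusters]
```

Labelling whole clusters rather than single items mirrors the abstract's design: a cluster pools the vocabulary of several reports on the same topic, which gives the classifier more evidence than any one short news item.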

Cite

Rebedea, T., & Trausan-Matu, S. (2008). Autonomous news clustering and classification for an intelligent Web portal. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4994 LNAI, pp. 477–486). Springer Verlag. https://doi.org/10.1007/978-3-540-68123-6_52