A graph-based approach to topic clustering for online comments to news

Abstract

This paper investigates graph-based approaches to labeled topic clustering of reader comments in online news. For graph-based clustering we propose a linear regression model of similarity between the graph nodes (comments) based on similarity features and weights trained using automatically derived training data. To label the clusters our graph-based approach makes use of DBPedia to abstract topics extracted from the clusters. We evaluate the clustering approach against gold standard data created by human annotators and compare its results against LDA – currently reported as the best method for the news comment clustering task. Evaluation of cluster labelling is set up as a retrieval task, where human annotators are asked to identify the best cluster given a cluster label. Our clustering approach significantly outperforms the LDA baseline and our evaluation of abstract cluster labels shows that graph-based approaches are a promising method of creating labeled clusters of news comments, although we still find cases where the automatically generated abstractive labels are insufficient to allow humans to correctly associate a label with its cluster.
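The idea of scoring comment pairs with a weighted linear combination of similarity features, then clustering the resulting graph, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two features (Jaccard overlap and term-frequency cosine), the hand-set weights in WEIGHTS (the paper trains its weights from automatically derived data), the 0.3 edge threshold, and the use of connected components as the clustering step.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokens(text):
    """Lowercase a comment and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a, b):
    """Share of distinct tokens that two token lists have in common."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity between raw term-frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical weights, set by hand for this sketch. In the paper the
# weights are learned by linear regression on derived training data.
WEIGHTS = {"bias": 0.0, "jaccard": 0.5, "cosine": 0.5}


def similarity(a, b, weights=WEIGHTS):
    """Linear model: bias plus a weighted sum of similarity features."""
    ta, tb = tokens(a), tokens(b)
    return (
        weights["bias"]
        + weights["jaccard"] * jaccard(ta, tb)
        + weights["cosine"] * cosine(ta, tb)
    )


def cluster(comments, threshold=0.3, weights=WEIGHTS):
    """Link comments whose score reaches the threshold and return the
    connected components of that graph as lists of comment indices.

    Connected components are an assumed stand-in for whatever graph
    clustering algorithm the paper actually uses.
    """
    parent = list(range(len(comments)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(comments)), 2):
        if similarity(comments[i], comments[j], weights) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(comments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Calling cluster on a list of comment strings returns groups of indices into that list. With trained weights, only the WEIGHTS dictionary would need to change; the threshold would likely need tuning on held-out data.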

Cite

Aker, A., Kurtic, E., Balamurali, A. R., Paramita, M., Barker, E., Hepple, M., & Gaizauskas, R. (2016). A graph-based approach to topic clustering for online comments to news. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9626, pp. 15–29). Springer Verlag. https://doi.org/10.1007/978-3-319-30671-1_2
