Hashtag processing for enhanced clustering of tweets

Abstract

Rich data provided by tweets have been analyzed, clustered, and explored in a variety of studies. Typically, those studies focus on named entity recognition, entity linking, and entity disambiguation or clustering. Tweets and hashtags are generally analyzed at the sentence or word level, but not at the compositional level of concatenated words. We propose an approach for a closer analysis of compounds in hashtags, and in the long run also of other types of text sequences in tweets, in order to enhance the clustering of such text documents. Hashtags have been used before as primary topic indicators to cluster tweets; however, to the best of our knowledge, their segmentation and its effect on clustering results have not been investigated. Our results with a standard dataset from the Text REtrieval Conference (TREC) show that segmented and harmonized hashtags have a positive impact on clustering effectiveness.
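
To illustrate what segmenting and harmonizing a hashtag could look like, here is a minimal Python sketch. It is not the authors' method, which the abstract does not describe. It assumes a simple two-step heuristic: split on case and digit boundaries, then break any remaining lowercase run into the fewest words found in a lexicon. The `VOCAB` set and all function names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon (assumption); a real system would use a large word list.
VOCAB = {"climate", "change", "now", "super", "bowl", "world", "cup"}


def split_camel_case(tag):
    """Split on case and letter/digit boundaries, e.g. 'SuperBowl50'."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", tag)


def segment_lowercase(text, vocab):
    """Return the split of text into the fewest vocab words, or [text] if none exists."""
    n = len(text)
    best = [None] * (n + 1)  # best[i] = best segmentation of text[:i]
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            piece = text[start:end]
            if best[start] is not None and piece in vocab:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n] if best[n] is not None else [text]


def segment_hashtag(hashtag, vocab=VOCAB):
    """Segment a hashtag into lowercase tokens."""
    tokens = []
    for chunk in split_camel_case(hashtag.lstrip("#")):
        tokens.extend(segment_lowercase(chunk.lower(), vocab))
    return tokens


def harmonize(hashtag, vocab=VOCAB):
    """Map spelling variants of a hashtag to one space-separated form."""
    return " ".join(segment_hashtag(hashtag, vocab))


if __name__ == "__main__":
    for tag in ["#ClimateChange", "#climatechange", "#SuperBowl50"]:
        print(tag, "->", harmonize(tag))
```

Under these assumptions, `#ClimateChange` and `#climatechange` are mapped to the same token sequence, so a clustering step that compares tokens would treat them as the same topic indicator.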

Cite

Gromann, D., & Declerck, T. (2017). Hashtag processing for enhanced clustering of tweets. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2017-September, pp. 277–283). Incoma Ltd. https://doi.org/10.26615/978-954-452-049-6_038
