Topic Identification and Categorization of Public Information in Community-Based Social Media

This article is free to access.

Abstract

This paper presents a semi-supervised method for topic identification and classification of short texts in social media, and its application to tweets containing dialogues among a large community of city dwellers, written mostly in Indonesian. These dialogues contain a wealth of information about the city, shared in real time. We found that, despite the high irregularity of the language used and the scarcity of suitable linguistic resources, meaningful topic identification could be performed by clustering the tweets using the K-Means algorithm. The resulting clusters are robust enough to serve as the basis of a classification. On three grouping schemes derived from the clusters, linear SVMs achieve accuracies of 95.52%, 95.51%, and 96.7%, reflecting the applicability of this method to topic identification and classification on such data.
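
The abstract does not describe the implementation, so the following is only a minimal sketch of the general idea: cluster unlabeled short texts with K-Means, then use the cluster assignments as training labels for a linear SVM. Everything specific here is an assumption made for illustration, not the authors' setup: the TF-IDF features, the farthest-first seeding, the sub-gradient (Pegasos-style) one-vs-rest SVM training, the function names, and the six invented toy tweets.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Smoothed inverse document frequency for each token seen in docs."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def vectorize(doc, idf):
    """L2-normalised sparse TF-IDF vector; tokens unseen in training are dropped."""
    tf = Counter(tok for tok in doc if tok in idf)
    vec = {tok: c * idf[tok] for tok, c in tf.items()}
    norm = math.sqrt(sum(x * x for x in vec.values())) or 1.0
    return {tok: x / norm for tok, x in vec.items()}

def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a
    return sum(x * b.get(tok, 0.0) for tok, x in a.items())

def sqdist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return dot(a, a) - 2.0 * dot(a, b) + dot(b, b)

def kmeans(vecs, k, iters=20):
    """Lloyd's K-Means with deterministic farthest-first seeding (an assumption)."""
    centroids = [vecs[0]]
    while len(centroids) < k:
        centroids.append(max(vecs, key=lambda v: min(sqdist(v, c) for c in centroids)))
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid for every vector.
        labels = [min(range(k), key=lambda j: sqdist(v, centroids[j])) for v in vecs]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vecs, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                mean = Counter()
                for v in members:
                    for tok, x in v.items():
                        mean[tok] += x / len(members)
                centroids[j] = dict(mean)
    return labels

def train_linear_svm(vecs, labels, k, epochs=50, lam=0.01):
    """One-vs-rest linear SVMs fitted by sub-gradient descent on the hinge loss."""
    weights = [dict() for _ in range(k)]
    t = 0
    for _ in range(epochs):
        for v, y in zip(vecs, labels):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            for j, w in enumerate(weights):
                target = 1.0 if y == j else -1.0
                margin = target * dot(w, v)
                for tok in w:  # L2 regularisation shrinkage
                    w[tok] *= 1.0 - eta * lam
                if margin < 1.0:  # hinge loss is active for this example
                    for tok, x in v.items():
                        w[tok] = w.get(tok, 0.0) + eta * target * x
    return weights

def predict(vec, weights):
    """Index of the one-vs-rest classifier with the highest score."""
    return max(range(len(weights)), key=lambda j: dot(weights[j], vec))

# Invented toy tweets (not from the paper): three about traffic, three about floods.
tweets = [
    "macet panjang jalan raya",
    "jalan tol macet total",
    "macet parah dekat terminal",
    "banjir setinggi lutut kampung",
    "hujan deras banjir meluas",
    "banjir merendam rumah warga",
]
docs = [t.lower().split() for t in tweets]
idf = fit_idf(docs)
vecs = [vectorize(d, idf) for d in docs]

labels = kmeans(vecs, k=2)                    # step 1: topics from clustering
weights = train_linear_svm(vecs, labels, k=2)  # step 2: classifier on cluster labels

predicted_traffic = predict(vectorize("macet lagi pagi ini".split(), idf), weights)
predicted_flood = predict(vectorize("awas banjir besar".split(), idf), weights)
```

In this sketch the cluster indices carry no meaning on their own; a person would still need to inspect each cluster and name its topic, which is presumably where grouping schemes like those mentioned in the abstract come in.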

Cite

Kusumawardani, R. P., & Basri, M. H. (2017). Topic Identification and Categorization of Public Information in Community-Based Social Media. In Journal of Physics: Conference Series (Vol. 801). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/801/1/012075
