On Twitter, users share their opinions in short messages called tweets. This paper applies machine learning techniques to tweet categorization. These techniques comprise different classification models, each with its own performance measures. The proposed work identifies a simple and workable approach for predicting the category of a tweet and investigates machine learning techniques on real-time Twitter data. Three techniques are used: a Naïve Bayes classifier, LinearSVC, and MultinomialNB. Before the classifiers are applied, term frequency-inverse document frequency (TF-IDF) is used as a simple pre-processing step. It produces a weight score for each term, and these scores form the feature vector for tweet classification. This feature extraction method is used to identify the most frequent words in the tweets. The dataset, referred to as the proposed corpus, consists of English tweets collected for each topic through the Twitter Streaming API. Classification performance is measured by accuracy. The results show that MultinomialNB performed best experimentally among the three techniques, reaching 79% accuracy.
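The pipeline described in the abstract (TF-IDF weights as features, followed by a multinomial Naïve Bayes classifier) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not state which library or parameters were used, so the toy tweets, the category labels, the smoothed IDF formula, the Laplace smoothing constant `alpha`, and all function names below are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real tweets need more cleaning.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1 (assumed variant).
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(doc, idf):
    # Term frequency times IDF; terms unseen during fitting are dropped.
    counts = Counter(tokenize(doc))
    return {term: c * idf[term] for term, c in counts.items() if term in idf}

def train_nb(docs, labels, idf, alpha=1.0):
    # Multinomial Naive Bayes fitted on TF-IDF weights instead of raw counts.
    class_docs = Counter(labels)
    weights = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        weights[label].update(tfidf(doc, idf))
    vocab_size = len(idf)
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(weights[label].values()) + alpha * vocab_size
        log_prior = math.log(n_docs / len(docs))
        log_like = {t: math.log((weights[label][t] + alpha) / total) for t in idf}
        model[label] = (log_prior, log_like)
    return model

def predict(doc, idf, model):
    # Pick the class with the highest log posterior score.
    vec = tfidf(doc, idf)
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t] for t, w in vec.items())
    return max(model, key=score)

# Hypothetical toy corpus; the paper's corpus is not reproduced here.
train_tweets = [
    "great goal in the football match",
    "team wins the football cup",
    "new phone camera release",
    "phone battery and camera review",
]
train_labels = ["sports", "sports", "tech", "tech"]

idf = fit_idf(train_tweets)
model = train_nb(train_tweets, train_labels, idf)
print(predict("football team match", idf, model))
print(predict("camera phone", idf, model))
```

On this toy corpus the two test tweets share vocabulary with only one category each, so the sketch assigns them to that category. Accuracy on real data would be measured on a held-out set, and the 79% figure reported above applies only to the authors' own corpus and setup.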
CITATION STYLE
Vadivukarassi, M., Puviarasan, N., & Aruna, P. (2019). A Comparison of Supervised Machine Learning Approaches for Categorized Tweets. In Lecture Notes on Data Engineering and Communications Technologies (Vol. 26, pp. 422–430). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-03146-6_47