Sarcasm is primary reason behind the faulty classification of the tweets. The tweets of sarcastic nature appear in the different compositions, but mainly deflect the meaning different than their actual composition. This confuses the classification models and produces false results. In the paper, the primary focus remains upon the classification of sarcastic tweets, which has been accomplished using the textual structure. This involves the expressions of speech, part of speech features, punctuations, term sentiment, affection, etc. All of the features are extracted individually from the target tweet and combined altogether to create the cumulative feature for the target tweet. The proposed model has been observed with accuracy slightly higher than 84%, which depicts the clear improvement in comparison with existing models. The random forest-based classification model has outperformed all other candidates deployed under the experiment. The random forest classifier is observed with accuracy of 84.7, which outperforms the SVM (78.6%), KNN (73.1%), and Maximum entropy (80.5%).
CITATION STYLE
Kumar, R., & Kaur, J. (2020). Random forest-based sarcastic tweet classification using multiple feature collection. In Intelligent Systems Reference Library (Vol. 163, pp. 131–160). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-13-8759-3_5
Mendeley helps you to discover research relevant for your work.