Trolling on social media is the use of provocative or offensive text in an attempt to dominate, disrupt, or derail the main topic of discussion. Identifying trolls can help protect organic users of the platform from the negative consequences of interacting with a troll. In this work, five condensed feature sets, namely sentiment, readability, post analysis, network, and frequency analysis, are used to make the broad distinction between troll and non-troll users. An ensemble of machine learning algorithms (with Random Forest, Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) as base classifiers and Random Forest as the meta-classifier) is used to perform the multilevel classification. In the first level, trolls are distinguished from non-trolls; in the second level, the trolls are classified into their respective types: Political, Communal, Conspiracy, or Asocial. Additionally, data-driven observations are used to expand the traditional understanding of antisocial behavior in trolls into a more multidimensional representation of trolling behavior. Using the stacking classifier, an accuracy of 78.72% was achieved in identifying trolls from non-trolls in the first phase, and an accuracy of 83.24% in classifying trolls into their respective categories in the second phase.
CITATION
Mathew, S. K., Alex, D., Deshpande, N., Sharma, R., Arya, A., & Balendra, D. P. (2024). Multilevel Troll Classification of Twitter Data Using Machine Learning Techniques. International Journal of Computer Theory and Engineering, 16(1), 21–28. https://doi.org/10.7763/IJCTE.2024.V16.1350