The emergence of online social networks has given users an easy way to publish and disseminate content, reaching broader audiences than earlier platforms (such as blogs or personal websites) allowed. However, malicious users have taken advantage of these features to spread unreliable content through the network, such as false information, extremely biased opinions, or hate speech. Consequently, it is crucial to detect these users at an early stage to prevent the propagation of unreliable content in social network ecosystems. In this work, we introduce a methodology to extract a large corpus of unreliable posts from Twitter using two databases of unreliable websites (OpenSources and Media Bias Fact Check). In addition, we present an analysis of the content, and of the users who publish and share, several types of unreliable content. Finally, we develop supervised models to classify a Twitter account according to its reliability. Experiments conducted on two different datasets show performance above 94% using Decision Trees as the learning algorithm. Although subject to some limitations, these experiments provide encouraging results for future research on detecting unreliable accounts in social networks.
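To make the classification step concrete, the following is a minimal sketch of the kind of model the abstract describes: a depth-1 decision tree (a "decision stump") over a single hypothetical account feature, the fraction of an account's shared links that point to domains flagged by OpenSources or Media Bias Fact Check. The feature choice and the toy data are illustrative assumptions; the paper's actual feature set and datasets are not reproduced here.

```python
# Illustrative sketch only: the feature (flagged-link ratio) and the toy
# data below are assumptions, not the paper's actual features or corpus.

def best_stump(xs, ys):
    """Find the threshold on one feature that best separates the labels
    (equivalent to fitting a depth-1 decision tree by exhaustive search)."""
    best_threshold, best_acc = None, -1.0
    for t in sorted(set(xs)):
        preds = [1 if x >= t else 0 for x in xs]
        acc = sum(p == y for p, y in zip(preds, ys)) / len(ys)
        if acc > best_acc:
            best_threshold, best_acc = t, acc
    return best_threshold, best_acc

# Toy training data: (flagged-link ratio, label), where 1 = unreliable account.
data = [(0.05, 0), (0.10, 0), (0.20, 0), (0.30, 0),
        (0.60, 1), (0.70, 1), (0.85, 1), (0.95, 1)]
xs, ys = zip(*data)
threshold, train_acc = best_stump(xs, ys)
print(threshold, train_acc)  # prints: 0.6 1.0
```

A deeper tree, as typically grown by standard learners, would repeat this split search recursively over many features; the stump above captures the core mechanism in isolation.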
CITATION STYLE
Guimaraes, N., Figueira, A., & Torgo, L. (2020). Analysis and Detection of Unreliable Users in Twitter: Two Case Studies. In Communications in Computer and Information Science (Vol. 1222 CCIS, pp. 50–73). Springer. https://doi.org/10.1007/978-3-030-49559-6_3