Advancing the Use of Information Compression Distances in Authorship Attribution

Abstract

Detecting unreliable information in social media is an open challenge, in part because of the difficulty of associating a piece of information with known and trustworthy actors. Identifying the origin of a piece of information can help society deal with unverified, incomplete, or even false information. In this work we tackle the problem of associating a piece of information with a particular politician. The use of inaccurate information is especially relevant in the case of politicians, since it affects social perception and voting behavior. Moreover, misquotation can be weaponized to damage an adversary's reputation. We consider the task of applying a compression-based metric to conduct authorship attribution in social media, namely on Twitter. Specifically, we leverage the Normalized Compression Distance (NCD) to compare an author’s text with other authors’ texts. We show that this methodology performs well, obtaining 80.3% accuracy in a scenario with 6 different politicians.
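
As a rough illustration of the metric named in the abstract, the NCD of two strings x and y is commonly defined as (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C gives the compressed size in bytes. The sketch below is not the authors' implementation. The choice of zlib as the compressor, the helper names (compressed_size, ncd, attribute) and the nearest-corpus decision rule are all assumptions made for illustration, since the abstract does not specify any of them.

```python
import zlib
from typing import Mapping


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance between two texts.

    Values near 0 suggest the texts share much structure; values near 1
    suggest they share little. Real compressors can push it slightly above 1.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def attribute(text: str, corpora: Mapping[str, str]) -> str:
    """Hypothetical decision rule: pick the author whose corpus is closest in NCD."""
    return min(corpora, key=lambda author: ncd(text, corpora[author]))
```

In this sketch each candidate author is represented by a single concatenated string of known texts, and an unseen text is assigned to the author with the smallest distance. Very short texts such as individual tweets compress poorly, so distances on them tend to be noisy unless enough known text per author is available.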

Cite

CITATION STYLE

APA

Muñoz, S. P., Oliva, C., Lago-Fernández, L. F., & Arroyo, D. (2022). Advancing the Use of Information Compression Distances in Authorship Attribution. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13545 LNCS, pp. 114–122). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-18253-2_8
