Large-scale micro-blog authorship attribution: Beyond simple feature engineering

Thiago Cavalcante; Anderson Rocha; Ariadne Carvalho

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8827 399-407

DOI: 10.1007/978-3-319-12568-8_49

Abstract

With the ever-growing use of social media, authorship attribution plays an important role in preventing cybercrime and in analyzing the online trails left behind by cyber pranksters, stalkers, bullies, identity thieves, and the like. In this paper, we propose a method for authorship attribution in micro-blogs that is one hundred to a thousand times faster than state-of-the-art counterparts. The method relies on a powerful and scalable feature representation that takes advantage of user patterns in micro-blog messages, and on a custom-tailored pattern classifier adapted to big, high-dimensional data. Finally, we discuss search-space reduction when analyzing hundreds of online suspects and millions of online micro-messages, which makes this approach invaluable for digital forensics and law enforcement.
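
The abstract does not specify the features or the classifier, so the following is only a minimal sketch of the general idea of pairing a scalable text representation with a classifier suited to high-dimensional data. Everything in it is an assumption made for illustration, not the authors' method: the character n-gram features, the hashing trick, the `NUM_BUCKETS` size, the `SparsePerceptron` class, and the toy messages are all hypothetical.

```python
from collections import Counter
import zlib

NUM_BUCKETS = 2 ** 18  # hypothetical size of the hashed feature space


def char_ngrams(text, n_min=1, n_max=4):
    """Yield every character n-gram of length n_min..n_max (assumed feature set)."""
    padded = f" {text.lower()} "
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            yield padded[i:i + n]


def featurize(text):
    """Map a message to a sparse {bucket: count} vector using the hashing trick."""
    counts = Counter()
    for gram in char_ngrams(text):
        counts[zlib.crc32(gram.encode("utf-8")) % NUM_BUCKETS] += 1
    return counts


class SparsePerceptron:
    """Multiclass perceptron keeping one sparse weight vector per author."""

    def __init__(self):
        self.weights = {}  # author -> {bucket: weight}

    def score(self, features, author):
        w = self.weights[author]
        return sum(w.get(k, 0.0) * v for k, v in features.items())

    def predict(self, text):
        features = featurize(text)
        # Ties go to the author registered first.
        return max(self.weights, key=lambda a: self.score(features, a))

    def fit(self, samples, max_epochs=100):
        """samples: list of (text, author) pairs. Stops after an error-free epoch."""
        for _, author in samples:
            self.weights.setdefault(author, {})
        for _ in range(max_epochs):
            mistakes = 0
            for text, author in samples:
                guess = self.predict(text)
                if guess == author:
                    continue
                mistakes += 1
                # Move the true author's weights toward the message
                # and the wrongly predicted author's weights away from it.
                for k, v in featurize(text).items():
                    self.weights[author][k] = self.weights[author].get(k, 0.0) + v
                    self.weights[guess][k] = self.weights[guess].get(k, 0.0) - v
            if mistakes == 0:
                break
        return self


# Toy data, invented for illustration only.
samples = [
    ("omg!!! sooo tired lol", "alice"),
    ("Good morning. Coffee first, then work.", "bob"),
    ("lol!!! cant waaait omg", "alice"),
    ("Good evening. Reading, then sleep.", "bob"),
]
model = SparsePerceptron().fit(samples)
```

Hashing keeps the feature space at a fixed size regardless of vocabulary, and the sparse updates touch only the buckets present in a message, which is one common way to keep per-message cost low when the number of messages is large. It makes no claim about the speedups reported in the paper.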

Citation (APA)

Cavalcante, T., Rocha, A., & Carvalho, A. (2014). Large-scale micro-blog authorship attribution: Beyond simple feature engineering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8827, pp. 399–407). Springer Verlag. https://doi.org/10.1007/978-3-319-12568-8_49
