Authorship Attribution of Brazilian Literary Texts Through Machine Learning Techniques

Abstract

Authorship attribution is the process of identifying the author of a particular document. This task has traditionally been performed by experts in the field. With the advancement of natural language processing tools and machine learning techniques, however, it is now also performed by computer systems. Its applications range from the detection of plagiarism and copyright infringement to the resolution of forensic problems. Several works address this subject for texts in English, but few consider texts in Portuguese. This paper therefore studies authorship attribution of texts from Brazilian literature. We carried out our experiments using the Naïve Bayes and Random Forests methods, and for feature extraction we considered Term Frequency-Inverse Document Frequency (TF-IDF) and Part-of-Speech (POS) techniques. The results showed that Random Forests, using as input the textual features extracted by Part of Speech, achieved the best cross-validation accuracy, although not the best runtime.
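
To make the kind of pipeline described in the abstract concrete, the following is a minimal sketch in plain Python of TF-IDF features fed to a multinomial Naïve Bayes classifier and scored by cross-validation. It is not the authors' implementation. The toy sentences, the labels `author_a` and `author_b`, the whitespace tokenizer, the smoothed IDF variant, and the leave-one-out scheme are all assumptions made for illustration; the abstract does not specify any of them. Random Forests and Part-of-Speech features are left out to keep the example short.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace tokenization only (an assumption; real work needs a proper tokenizer).
    return text.lower().split()


def fit_idf(token_lists):
    # Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1.
    n_docs = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}


def tfidf(tokens, idf):
    # Raw term count times IDF; terms unseen during fitting are dropped.
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}


class NaiveBayes:
    """Multinomial Naive Bayes over TF-IDF weights with add-one smoothing."""

    def fit(self, texts, labels):
        token_lists = [tokenize(t) for t in texts]
        self.idf = fit_idf(token_lists)
        self.class_counts = Counter(labels)
        self.weights = defaultdict(lambda: defaultdict(float))
        for tokens, label in zip(token_lists, labels):
            for term, w in tfidf(tokens, self.idf).items():
                self.weights[label][term] += w
        return self

    def predict(self, text):
        features = tfidf(tokenize(text), self.idf)
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.idf)
        scores = {}
        for label, count in self.class_counts.items():
            total = sum(self.weights[label].values())
            score = math.log(count / n_docs)  # log prior
            for term, w in features.items():
                seen = self.weights[label].get(term, 0.0)
                score += w * math.log((seen + 1.0) / (total + vocab_size))
            scores[label] = score
        return max(scores, key=scores.get)


def leave_one_out_accuracy(texts, labels):
    # Hold out each document once, train on the rest, and count correct predictions.
    hits = 0
    for i in range(len(texts)):
        train_x = texts[:i] + texts[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += NaiveBayes().fit(train_x, train_y).predict(texts[i]) == labels[i]
    return hits / len(texts)


# Hypothetical toy corpus: invented sentences and placeholder author labels.
texts = [
    "o mar e a noite", "a noite sobre o mar", "o mar calmo da noite",
    "a cidade e o trem", "o trem cruza a cidade", "a cidade espera o trem",
]
labels = ["author_a"] * 3 + ["author_b"] * 3

model = NaiveBayes().fit(texts, labels)
accuracy = leave_one_out_accuracy(texts, labels)
```

Refitting the IDF table inside each fold, as `leave_one_out_accuracy` does through `fit`, keeps the held-out document from influencing the features it is later scored on.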

Cite

da Rocha Bartolomei, B., & Drummond, I. N. (2020). Authorship Attribution of Brazilian Literary Texts Through Machine Learning Techniques. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12319 LNAI, pp. 389–402). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61377-8_27
