Improving cross-topic authorship attribution: The role of pre-processing

Abstract

The effectiveness of character n-gram features for representing the stylistic properties of a text has been demonstrated in various independent Authorship Attribution (AA) studies. Moreover, some categories of character n-grams have been shown to perform better than others under both single-topic and cross-topic AA conditions. In this work, we present an improved algorithm for cross-topic AA. We demonstrate that the effectiveness of character n-gram representations can be significantly enhanced by performing simple pre-processing steps and appropriately tuning the number of features, especially in cross-topic conditions.
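
As an illustration only, the following Python sketch shows what a character n-gram pipeline with a pre-processing step and a tunable feature count might look like. The abstract does not specify the pre-processing steps or the tuning procedure, so everything here is an assumption: the digit-masking and lowercasing in `preprocess`, the frequency cutoff `min_count`, and all function names are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def preprocess(text):
    """Hypothetical pre-processing (an assumption, not the paper's method):
    lowercase the text and mask every digit with '0'."""
    return re.sub(r"\d", "0", text.lower())


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_vocabulary(documents, n=3, min_count=2):
    """Keep only n-grams seen at least min_count times in the corpus.
    Raising min_count shrinks the feature set; this is one assumed way
    of tuning the number of features."""
    counts = Counter()
    for doc in documents:
        counts.update(char_ngrams(preprocess(doc), n))
    return sorted(g for g, c in counts.items() if c >= min_count)


def vectorize(text, vocabulary, n=3):
    """Map a text to a vector of n-gram counts over the vocabulary."""
    counts = Counter(char_ngrams(preprocess(text), n))
    return [counts[g] for g in vocabulary]
```

In this sketch, masking digits is meant to stop topic-specific tokens such as years or quantities from producing distinct features, and `min_count` controls how many n-grams survive into the representation. Both choices would need to be validated on held-out data before drawing any conclusion about cross-topic performance.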

Cite

Markov, I., Stamatatos, E., & Sidorov, G. (2018). Improving cross-topic authorship attribution: The role of pre-processing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10762 LNCS, pp. 289–302). Springer Verlag. https://doi.org/10.1007/978-3-319-77116-8_21
