Abstract
This paper describes and evaluates Spatium-L1, an unsupervised and effective authorship verification model. As features, we suggest using the 200 most frequent terms of the disputed text (isolated words and punctuation symbols). By applying a simple distance measure and a set of impostors, we can determine whether or not the disputed text was written by the proposed author. Moreover, a simple rule defines when there is enough evidence to propose an answer and when the attribution scheme cannot decide with a high degree of certainty. Evaluations based on six test collections (PAN CLEF 2014 evaluation campaign) indicate that Spatium-L1 usually ranks among the top three verification systems and, on an aggregate measure, achieves the best performance. The suggested strategy can be readily adapted to different Indo-European languages (such as English, Dutch, Spanish, and Greek) and genres (essay, novel, review, and newspaper article).
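The pipeline outlined in the abstract (most frequent terms of the disputed text, an L1 distance, a set of impostors, and a rule that allows abstaining) might be sketched as follows. This is a minimal illustration and not the authors' implementation: the tokenizer, the use of relative frequencies, the averaging over impostors, the `margin` value, and all function names are assumptions made for the example, since the abstract does not specify them.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase the text; words and punctuation symbols are separate tokens.
    return re.findall(r"\w+|[^\w\s]", text.lower())


def rel_freqs(tokens):
    # Relative frequency of every token in one text.
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def top_terms(tokens, k=200):
    # The k most frequent terms of the disputed text serve as the feature set.
    return [term for term, _ in Counter(tokens).most_common(k)]


def l1_distance(terms, p, q):
    # L1 (Manhattan) distance restricted to the selected terms.
    return sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in terms)


def verify(disputed, candidate, impostors, k=200, margin=0.05):
    """Return 'same', 'different', or 'unknown'.

    Assumptions (not taken from the abstract): the candidate distance is
    compared with the mean impostor distance, and `margin` defines the
    band in which no decision is proposed. `impostors` must be non-empty.
    """
    d_tokens = tokenize(disputed)
    terms = top_terms(d_tokens, k)
    d_profile = rel_freqs(d_tokens)

    d_cand = l1_distance(terms, d_profile, rel_freqs(tokenize(candidate)))
    d_imps = [
        l1_distance(terms, d_profile, rel_freqs(tokenize(text)))
        for text in impostors
    ]
    d_imp = sum(d_imps) / len(d_imps)

    if d_cand < (1 - margin) * d_imp:
        return "same"       # candidate is clearly closer than the impostors
    if d_cand > (1 + margin) * d_imp:
        return "different"  # candidate is clearly farther than the impostors
    return "unknown"        # not enough evidence either way
```

In this sketch, `verify(disputed_text, candidate_author_text, impostor_texts)` returns `"unknown"` whenever the candidate's distance falls within the margin around the impostor baseline, mirroring the idea that the scheme should abstain when the evidence is weak.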
Kocher, M., & Savoy, J. (2017). A simple and efficient algorithm for authorship verification. Journal of the Association for Information Science and Technology, 68(1), 259–269. https://doi.org/10.1002/asi.23648