Complete syntactic N-grams as style markers for authorship attribution

Abstract

In this paper we present an authorship attribution method that uses complete (non-continuous, with bifurcations) syntactic n-grams as style markers. Syntactic n-grams are obtained by following paths in subtrees of a syntactic tree. We work with relatively short text fragments and build authors' profiles of various sizes using the tf-idf scheme. We train an SVM classifier to perform the task. We compare the method with one based on character n-grams and show that accuracy increases when complete syntactic n-grams are used.
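To make the notion of a complete syntactic n-gram concrete, the following is a minimal sketch, not the authors' implementation. It assumes a dependency tree given as a plain dictionary from head word to dependents, treats every connected subtree with exactly n nodes as one n-gram, and writes bifurcations with brackets and commas. The function names (`rooted_subtrees`, `render`, `complete_sn_grams`), the toy sentence, and the bracket notation are illustrative assumptions.

```python
def rooted_subtrees(tree, root, n):
    """Return every connected subtree with exactly n nodes whose top node is root.

    Each subtree is a nested tuple: (label, (child_subtree, ...)).
    """
    if n == 1:
        return [(root, ())]
    results = []

    def extend(children, budget, picked):
        # budget = number of nodes still to place below root
        if budget == 0:
            results.append((root, tuple(picked)))
            return
        if not children:
            return
        first, rest = children[0], children[1:]
        # Option 1: leave this dependent out of the n-gram.
        extend(rest, budget, picked)
        # Option 2: include it, with a sub-subtree of 1..budget nodes.
        for size in range(1, budget + 1):
            for sub in rooted_subtrees(tree, first, size):
                extend(rest, budget - size, picked + [sub])

    extend(tree.get(root, []), n - 1, [])
    return results


def render(subtree):
    """Serialize a subtree; brackets mark dependents, commas mark bifurcations."""
    label, kids = subtree
    if not kids:
        return label
    return label + "[" + ",".join(render(k) for k in kids) + "]"


def complete_sn_grams(tree, n):
    """Collect the size-n syntactic n-grams rooted at every node of the tree."""
    nodes = set(tree) | {c for kids in tree.values() for c in kids}
    return [render(s) for node in sorted(nodes)
            for s in rooted_subtrees(tree, node, n)]


# Toy dependency tree (assumed) for "John likes red apples".
tree = {"likes": ["John", "apples"], "apples": ["red"]}
print(complete_sn_grams(tree, 3))
```

For the toy tree, size 3 yields one linear path (`likes[apples[red]]`) and one bifurcated n-gram (`likes[John,apples]`). The second is the kind of pattern that continuous, path-only syntactic n-grams would miss. In a pipeline like the one the abstract describes, such strings would then be counted and weighted with tf-idf before being passed to a classifier.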

Cite

Posadas-Duran, J. P., Sidorov, G., & Batyrshin, I. (2014). Complete syntactic N-grams as style markers for authorship attribution. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8856, 9–17. https://doi.org/10.1007/978-3-319-13647-9_2
