Abstract
Modern authorship attribution approaches aim to analyze known authors and to assign authorship to previously unseen, unlabeled text documents based on various features. In this paper, we present a novel feature that enhances current attribution methods by analyzing the grammar of authors. To extract the feature, a syntax tree is computed for each sentence of a document and then split into length-independent patterns using pq-grams. The most frequently used pq-grams are then used to compose sample profiles of authors, which are compared with the profile of the unlabeled document using various distance metrics and similarity scores. An evaluation on three different, independent data sets yields promising results and indicates that the grammar of authors is a significant feature for enhancing modern authorship attribution methods.
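The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tree encoding as `(label, [children])` tuples, the function names `pq_grams`, `profile`, and `profile_distance`, the defaults `p=2` and `q=3`, and the sum-of-absolute-differences distance are all assumptions made for illustration. The abstract does not state which parser, parameters, or distance metrics were used.

```python
from collections import Counter

STAR = "*"  # dummy label used to pad stems and bases


def pq_grams(tree, p=2, q=3):
    """Return the pq-grams of a tree given as (label, [children])."""
    grams = []

    def visit(node, stem):
        label, children = node
        stem = stem[1:] + (label,)        # anchor node plus its p-1 ancestors
        base = (STAR,) * q
        if not children:
            grams.append(stem + base)     # a leaf yields one all-dummy base
            return
        for child in children:
            base = base[1:] + (child[0],)  # slide the window over the children
            grams.append(stem + base)
            visit(child, stem)
        for _ in range(q - 1):            # slide the window past the last child
            base = base[1:] + (STAR,)
            grams.append(stem + base)

    visit(tree, (STAR,) * p)
    return grams


def profile(trees, p=2, q=3, top_k=None):
    """Relative frequencies of the top_k most frequent pq-grams (all if None)."""
    counts = Counter(g for t in trees for g in pq_grams(t, p, q))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.most_common(top_k)}


def profile_distance(a, b):
    """Assumed metric: sum of absolute frequency differences."""
    return sum(abs(a.get(g, 0.0) - b.get(g, 0.0)) for g in set(a) | set(b))


# Hypothetical toy syntax trees standing in for parsed sentences.
known = [("S", [("NP", []), ("VP", [("V", []), ("NP", [])])])]
unknown = [("S", [("NP", []), ("VP", [("V", [])])])]
print(profile_distance(profile(known), profile(unknown)))
```

In this sketch, an unlabeled document would be attributed to the known author whose profile has the smallest distance to the document's profile.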
Cite
Tschuggnall, M., & Specht, G. (2014). Enhancing Authorship Attribution by Utilizing Syntax Tree Profiles. In EACL 2014 - 14th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 195–199). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/e14-4038