Authorship attribution and verification with many authors and limited data

Kim Luyckx; Walter Daelemans

Conference Proceedings

Belgian/Netherlands Artificial Intelligence Conference (2008) 335-336

DOI: 10.3115/1599081.1599146

Abstract

Most studies in statistical or machine-learning-based authorship attribution focus on two or a few authors. This leads to an overestimation of the importance of the features extracted from the training data and found to be discriminating for these small sets of authors. Most studies also use amounts of training data that are unrealistic for most situations in which stylometry is applied (e.g., forensics), and thereby overestimate the accuracy of their approach in these situations. In this paper, we show, on the basis of a new corpus with 145 different authors, the effect of many authors on feature selection and learning, and we demonstrate the robustness of a memory-based learning approach to authorship attribution and verification with many authors and limited training data, compared to eager learners.
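
As a rough illustration of what a memory-based (lazy) learner looks like in this setting, the sketch below stores every training text and attributes a new text to the author of its nearest stored neighbours. It is not the authors' implementation: the character-trigram features, the cosine similarity, the class name `MemoryBasedAttributor`, and the verification threshold of 0.5 are all assumptions chosen for brevity, since the abstract does not specify them.

```python
from collections import Counter
import math


def char_ngrams(text, n=3):
    """Relative frequencies of character n-grams (an assumed feature set)."""
    text = text.lower()
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values()) or 1
    return {gram: count / total for gram, count in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(value * b.get(gram, 0.0) for gram, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class MemoryBasedAttributor:
    """Hypothetical k-nearest-neighbour attributor: no abstraction at
    training time, all generalisation happens when a query arrives."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []  # list of (feature_vector, author) pairs

    def fit(self, texts, authors):
        # "Training" only stores the examples, unlike an eager learner.
        self.memory = [(char_ngrams(t), a) for t, a in zip(texts, authors)]
        return self

    def neighbours(self, text):
        query = char_ngrams(text)
        scored = [(cosine(query, vec), author) for vec, author in self.memory]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:self.k]

    def attribute(self, text):
        """Attribution: majority vote among the k most similar stored texts."""
        votes = Counter(author for _, author in self.neighbours(text))
        return votes.most_common(1)[0][0]

    def verify(self, text, claimed_author, threshold=0.5):
        """Verification: accept only if the closest stored text belongs to
        the claimed author and is similar enough (threshold is assumed)."""
        top = self.neighbours(text)
        if not top:
            return False
        score, author = top[0]
        return author == claimed_author and score >= threshold
```

With 145 candidate authors and little text per author, a setup like this would call `fit` once on the labelled samples and then `attribute` or `verify` per unseen text; in practice the feature set and the value of `k` would need tuning on held-out data.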

Luyckx, K., & Daelemans, W. (2008). Authorship attribution and verification with many authors and limited data. In Belgian/Netherlands Artificial Intelligence Conference (pp. 335–336). https://doi.org/10.3115/1599081.1599146