Comparison of cross-validation and test sets approaches to evaluation of classifiers in authorship attribution domain

Grzegorz Baron

Conference Proceedings, Open Access

Communications in Computer and Information Science (2016) 659 81-89

DOI: 10.1007/978-3-319-47217-1_9

Abstract

The presented paper addresses the problem of evaluating decision systems in the authorship attribution domain. Two typical approaches are cross-validation and evaluation based on specially created test datasets. Preparing test sets can sometimes be troublesome. Another problem appears when discretization of the input sets is taken into account: it is not obvious how to discretize test datasets. A model evaluation method that does not require test sets would therefore be useful. Cross-validation is a well-known and broadly accepted method, so the question arose whether it can deliver reliable information about the quality of a prepared decision system. A set of classifiers was selected, and different discretization algorithms were applied to obtain method-invariant outcomes. The comparative results of experiments performed using the cross-validation and test-set approaches to system evaluation are presented, together with conclusions.
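The two evaluation approaches named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic two-class 2-D data stands in for stylometric features, the nearest-centroid rule stands in for the classifiers studied, and discretization is left out entirely. It only shows where the two accuracy estimates come from: cross-validation reuses the training data, while the test-set approach needs a second, independently prepared dataset.

```python
import random
from statistics import mean

def make_data(n_per_class, centers, spread, rng):
    """Synthetic 2-D points; a hypothetical stand-in for stylometric feature vectors."""
    data = []
    for label, (cx, cy) in enumerate(centers):
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, spread), rng.gauss(cy, spread)), label))
    return data

def fit_centroids(train):
    """Nearest-centroid 'classifier': one mean vector per class (illustrative only)."""
    by_label = {}
    for x, y in train:
        by_label.setdefault(y, []).append(x)
    return {y: tuple(mean(col) for col in zip(*xs)) for y, xs in by_label.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(centroids, x) == y for x, y in samples) / len(samples)

def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(data, k, rng):
    """Mean accuracy over k folds, each fold held out once for evaluation."""
    scores = []
    for held_out in k_fold_indices(len(data), k, rng):
        held = set(held_out)
        train = [s for i, s in enumerate(data) if i not in held]
        test = [data[i] for i in held_out]
        scores.append(accuracy(fit_centroids(train), test))
    return mean(scores)

rng = random.Random(0)
centers = [(0.0, 0.0), (3.0, 3.0)]          # assumed class centers
train_set = make_data(50, centers, 1.0, rng)
test_set = make_data(50, centers, 1.0, rng)  # separately generated test set

# Approach 1: cross-validation on the training data only.
cv_acc = cross_val_accuracy(train_set, 10, rng)
# Approach 2: train on all training data, evaluate on the separate test set.
test_acc = accuracy(fit_centroids(train_set), test_set)

print(f"10-fold CV accuracy:        {cv_acc:.3f}")
print(f"separate test set accuracy: {test_acc:.3f}")
```

In this toy setting both datasets are drawn from the same distribution, so the two estimates would be expected to be close. Whether that holds for real authorship data, and after discretization, is the question the paper investigates.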

Cite

Baron, G. (2016). Comparison of cross-validation and test sets approaches to evaluation of classifiers in authorship attribution domain. In Communications in Computer and Information Science (Vol. 659, pp. 81–89). Springer Verlag. https://doi.org/10.1007/978-3-319-47217-1_9
