Abstract
This paper addresses the problem of evaluating decision systems in the authorship attribution domain. Two typical approaches are cross-validation and evaluation based on specially created test datasets. Preparing test sets can be troublesome, and a further problem appears when discretization of the input sets is taken into account, because it is not obvious how the test datasets should be discretized. An evaluation method that does not require test sets would therefore be useful. Cross-validation is a well-known and broadly accepted method, so the question arose whether it can deliver reliable information about the quality of the prepared decision system. A set of classifiers was selected, and different discretization algorithms were applied to obtain method-invariant outcomes. Comparative results of experiments performed using the cross-validation and test-set approaches to system evaluation are presented, together with conclusions.
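The two evaluation approaches compared in the abstract can be illustrated with a minimal sketch. It is not the paper's experimental setup. The synthetic two-class data, the nearest-centroid classifier, and every function name (`make_data`, `fit_centroids`, `cross_val_accuracy`) are hypothetical stand-ins chosen so that the example runs on the Python standard library alone. Discretization is left out.

```python
import random
from statistics import mean

def make_data(n, shift, seed):
    """Hypothetical synthetic data: two 'authors', two numeric features each."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        features = [rng.gauss(center + shift, 1.0), rng.gauss(center + shift, 1.0)]
        rows.append((features, label))
    return rows

def fit_centroids(rows):
    """Nearest-centroid classifier: store the mean feature vector per class."""
    centroids = {}
    for label in {y for _, y in rows}:
        points = [x for x, y in rows if y == label]
        centroids[label] = [mean(col) for col in zip(*points)]
    return centroids

def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def accuracy(centroids, rows):
    return sum(predict(centroids, x) == y for x, y in rows) / len(rows)

def cross_val_accuracy(rows, k=5):
    """k-fold cross-validation: each fold is held out once, the rest is used for training."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        scores.append(accuracy(fit_centroids(train), folds[i]))
    return mean(scores)

# Assumption: the separate test set comes from slightly shifted conditions,
# standing in for texts that were not available when the model was built.
train = make_data(200, shift=0.0, seed=1)
test = make_data(100, shift=0.5, seed=2)

cv_acc = cross_val_accuracy(train, k=5)
test_acc = accuracy(fit_centroids(train), test)
print(f"cross-validation accuracy: {cv_acc:.3f}")
print(f"separate test set accuracy: {test_acc:.3f}")
```

Cross-validation reuses only the training data, so it can only estimate performance on data resembling that set. The separate test set measures performance on data the model never saw, which here was deliberately drawn under shifted conditions.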
Cite
Baron, G. (2016). Comparison of cross-validation and test sets approaches to evaluation of classifiers in authorship attribution domain. In Communications in Computer and Information Science (Vol. 659, pp. 81–89). Springer Verlag. https://doi.org/10.1007/978-3-319-47217-1_9