Instance spaces for machine learning classification

Mario A. Muñoz; Laura Villanova; Davaatseren Baatar; Kate Smith-Miles

Journal Article, Open Access

Machine Learning (2018) 107(1) 109-147

DOI: 10.1007/s10994-017-5629-5

97 Citations

97 Readers

Abstract

This paper tackles the issue of objective performance evaluation of machine learning classifiers, and the impact of the choice of test instances. Given that statistical properties or features of a dataset affect the difficulty of an instance for particular classification algorithms, we examine the diversity and quality of the UCI repository of test instances used by most machine learning researchers. We show how an instance space can be visualized, with each classification dataset represented as a point in the space. The instance space is constructed to reveal pockets of hard and easy instances, and enables the strengths and weaknesses of individual classifiers to be identified. Finally, we propose a methodology to generate new test instances with the aim of enriching the diversity of the instance space, enabling potentially greater insights than can be afforded by the current UCI repository.

Cite

Muñoz, M. A., Villanova, L., Baatar, D., & Smith-Miles, K. (2018). Instance spaces for machine learning classification. Machine Learning, 107(1), 109–147. https://doi.org/10.1007/s10994-017-5629-5
