Model selection is an important and ubiquitous task in machine learning. The aim is to choose, from a set of competing classification models, the one with the best future performance as measured by a goal metric; in practice, an evaluation metric is used to make that choice. A common practice is to use the same metric as both the goal and the evaluation metric. However, several recent studies claim that using an evaluation metric (such as AUC) other than the goal metric (such as accuracy) results in better selection of the correct models. In this paper, we point out a flaw in the experimental design of those studies and propose an improved method to test the claim. Our extensive experiments show that the goal metric itself selects the correct classification models most reliably. © 2008 Springer-Verlag Berlin Heidelberg.
CITATION STYLE
Huang, J., Ling, C. X., Zhang, H., & Matwin, S. (2008). Proper model selection with significance test. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5211 LNAI, pp. 536–547). https://doi.org/10.1007/978-3-540-87479-9_53