The goal of model-order selection is to select a model variant that generalizes best from training data to unseen test data. In unsupervised learning without any labels, the computation of the generalization error of a solution poses a conceptual problem which we address in this paper. We formulate the principle of "minimum transfer costs" for model-order selection. This principle renders the concept of cross-validation applicable to unsupervised learning problems. As a substitute for labels, we introduce a mapping between objects of the training set to objects of the test set enabling the transfer of training solutions. Our method is explained and investigated by applying it to well-known problems such as singular-value decomposition, correlation clustering, Gaussian mixture-models, and k-means clustering. Our principle finds the optimal model complexity in controlled experiments and in real-world problems such as image denoising, role mining and detection of misconfigurations in access-control data. © 2011 Springer-Verlag.
CITATION STYLE
Frank, M., Chehreghani, M. H., & Buhmann, J. M. (2011). The minimum transfer cost principle for model-order selection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6911 LNAI, pp. 423–438). https://doi.org/10.1007/978-3-642-23780-5_37
Mendeley helps you to discover research relevant for your work.