Viewing all models as 'probabilistic'

Abstract

In order to apply the Minimum Description Length Principle, one must associate each model in the model class under consideration with a corresponding code. For probabilistic model classes, there is a principled and generally agreed-upon method for doing this; for non-probabilistic model classes (i.e. classes of functions together with associated error functions) it is not so clear how to do this. Here, we present a new method for probabilistic and non-probabilistic model classes alike. Our method can be re-interpreted as mapping arbitrary model classes to associated classes of probability distributions. The method can therefore also be applied in a Bayesian context. In contrast to earlier proposals by Barron, Yamanishi and Rissanen and to the ad-hoc solutions found in applications of MDL, our method involves learning the optimal scaling factor in the mapping from models to codes/probability distributions from the data at hand. We show that this method satisfies several optimality properties. We present several theorems that suggest that with the help of our mapping of models to codes, one can successfully learn using MDL and/or Bayesian methods when (1) almost arbitrary model classes and error functions are allowed, and (2) none of the models in the class under consideration are close to the 'truth' that generates the data.
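For orientation, here is a hedged sketch of the kind of mapping the abstract describes: a non-probabilistic model, i.e. a function f with an associated error (loss) function L, can be turned into a conditional probability distribution by exponentiating the negated, scaled loss and normalizing. The scaling factor β and normalizer Z(β) below are illustrative notation rather than the paper's exact construction; the abstract's point is that β is learned from the data rather than fixed in advance. Assuming for simplicity that Z(β) does not depend on x (as when L depends only on y - f(x)):

\[
  P_\beta(y \mid x) \;=\; \frac{\exp\bigl(-\beta\, L(y, f(x))\bigr)}{Z(\beta)},
  \qquad
  Z(\beta) \;=\; \sum_{y'} \exp\bigl(-\beta\, L(y', f(x))\bigr),
\]
\[
  \hat{\beta} \;=\; \arg\min_{\beta > 0}\;
  \Bigl[\, -\log \textstyle\prod_{i=1}^{n} P_\beta(y_i \mid x_i) \,\Bigr]
  \;=\; \arg\min_{\beta > 0}\;
  \Bigl[\, \beta \sum_{i=1}^{n} L(y_i, f(x_i)) \;+\; n \log Z(\beta) \,\Bigr].
\]

(For continuous outcomes the sum defining Z(β) becomes an integral; with squared loss on the reals this mapping yields a Gaussian noise model with variance 1/(2β), so learning β amounts to learning the noise level.)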

Cite

APA

Grunwald, P. (1999). Viewing all models as “probabilistic.” In Proceedings of the Annual ACM Conference on Computational Learning Theory (pp. 171–182). ACM. https://doi.org/10.1145/307400.307436
