A new criterion for selecting models from partially observed data

  • Shimodaira H

Abstract

A new criterion, PDIO (predictive divergence for indirect observation models), is proposed for selecting statistical models from partially observed data. PDIO is devised for "indirect observation models", in which observations are available only indirectly through random variables; that is, some underlying hidden structure is assumed to generate the manifest variables. Examples include unsupervised-learning recognition systems, clustering, latent structure analysis, mixture distribution models, missing data, noisy observations, and, more generally, models whose maximum likelihood estimator is based on the EM (expectation-maximization) algorithm. PDIO is a natural extension of AIC (Akaike's information criterion), and the two criteria are equivalent when direct observations are available. Both criteria are expressed as the sum of two terms: the first represents the goodness of fit of the model to the observed data, and the second represents the model complexity. The goodness-of-fit terms are equivalent in both criteria, but the complexity terms differ. The complexity term is a function of the model structure and the number of samples, and it is added to take into account the reliability of the observed data. PDIO uses a mean fluctuation of the estimated true distribution as its measure of model complexity. From the information-geometric point of view, the relationship between the "model manifold" and the "observed manifold" is therefore reflected in the complexity term of PDIO, whereas in AIC the complexity term reduces to the number of parameters. PDIO is distinctive in dealing with the unobservable underlying structure "positively." In this paper, the generalized expression of PDIO is given in terms of two Fisher information matrices. An approximate method for computing PDIO from EM iterates is also presented. Computer simulations demonstrate how the criterion works.
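
The two-term structure described above can be illustrated with a small sketch. The abstract does not state the PDIO formula, so the penalty used below, 2·tr(I_complete · I_observed⁻¹), is only an assumed form chosen because it is built from two Fisher information matrices and reduces to the AIC penalty when the two matrices coincide (the direct-observation case). The function names `aic`, `trace_penalty_criterion`, and `solve` are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def solve(a: Matrix, b: Matrix) -> Matrix:
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b]; copied so the inputs are not mutated.
    aug = [[float(v) for v in a[i]] + [float(v) for v in b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def aic(log_likelihood: float, num_params: int) -> float:
    """Standard AIC: goodness-of-fit term plus twice the parameter count."""
    return -2.0 * log_likelihood + 2.0 * num_params


def trace_penalty_criterion(
    log_likelihood: float, info_complete: Matrix, info_observed: Matrix
) -> float:
    """ASSUMED PDIO-like form (not confirmed by the abstract):
    -2 log L + 2 tr(I_complete @ inv(I_observed)).
    """
    # tr(I_c @ inv(I_o)) equals tr(inv(I_o) @ I_c), so solve I_o @ X = I_c.
    x = solve(info_observed, info_complete)
    trace = sum(x[i][i] for i in range(len(x)))
    return -2.0 * log_likelihood + 2.0 * trace


# Illustrative numbers only; they do not come from the paper.
log_lik = -10.0
info_obs = [[2.0, 0.5], [0.5, 1.0]]

# Direct observation: both information matrices coincide, penalty is 2 * k.
direct = trace_penalty_criterion(log_lik, info_obs, info_obs)

# Indirect observation: the complete data carry more information.
info_comp = [[4.0, 0.0], [0.0, 3.0]]
indirect = trace_penalty_criterion(log_lik, info_comp, [[2.0, 0.0], [0.0, 1.0]])

print(aic(log_lik, 2), direct, indirect)
```

Under this assumed form, the goodness-of-fit term is shared with AIC and only the complexity term changes, growing as the observed data become less informative than the complete data. With NumPy available, `numpy.linalg.solve` could replace the hand-written `solve` helper.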

Cite

Shimodaira, H. (1994). A new criterion for selecting models from partially observed data (pp. 21–29). https://doi.org/10.1007/978-1-4612-2660-4_3
