Information divergence geometry and the application to statistical machine learning


Abstract

This chapter presents an intuitive understanding of statistical learning from an information-geometric point of view. We discuss a wide class of information divergences that quantify the departure between any two probability density functions. In general, an information divergence yields a statistical method through its minimization over the available empirical data. We discuss how an information divergence is associated with a Riemannian metric and a pair of conjugate linear connections on a family of probability density functions. The most familiar example is the Kullback-Leibler divergence, which leads to the maximum likelihood method and is associated with the information metric and the pair of exponential and mixture connections. For the class of statistical methods obtained by minimizing a divergence, we discuss statistical properties with a focus on robustness. As applications to statistical learning, we discuss minimum divergence methods for principal component analysis, independent component analysis, and statistical pattern recognition. © 2009 Springer US.
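As a brief illustration of the link stated in the abstract (not part of the original text), the following sketch shows the Kullback-Leibler divergence between an underlying density g and a model density f_theta, and how replacing the expectation under g by an empirical average recovers the maximum likelihood estimator:

\begin{align*}
D_{\mathrm{KL}}(g \,\|\, f_\theta)
  &= \int g(x)\,\log\frac{g(x)}{f_\theta(x)}\,dx
   = \underbrace{\int g(x)\log g(x)\,dx}_{\text{independent of }\theta}
     \;-\; \mathbb{E}_{g}\!\left[\log f_\theta(X)\right],\\[4pt]
\hat\theta
  &= \arg\min_\theta D_{\mathrm{KL}}(\hat g_n \,\|\, f_\theta)
   = \arg\max_\theta \frac{1}{n}\sum_{i=1}^{n}\log f_\theta(x_i),
\end{align*}

where \hat g_n denotes the empirical distribution of the observations x_1, \dots, x_n; the right-hand side is exactly the maximum likelihood criterion. Other divergences in the class discussed in the chapter give rise to different estimating criteria in the same way.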

Citation (APA)

Eguchi, S. (2009). Information divergence geometry and the application to statistical machine learning. In Information Theory and Statistical Learning (pp. 309–332). Springer US. https://doi.org/10.1007/978-0-387-84816-7_13
