Stochastic Normalizations as Bayesian Learning

Abstract

In this work we investigate the reasons why Batch Normalization (BN) improves the generalization performance of deep networks. We argue that one major reason, distinguishing it from data-independent normalization methods, is the randomness of the batch statistics. This randomness appears in the parameters rather than in the activations and admits an interpretation as practical Bayesian learning. We apply this idea to other (deterministic) normalization techniques that are oblivious to the batch size. We show that their generalization performance can be improved significantly by Bayesian learning of the same form. We obtain test performance comparable to BN and, at the same time, better validation losses, which are suitable for subsequent output uncertainty estimation through the approximate Bayesian posterior.
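To make the idea concrete, the following is a minimal, hypothetical sketch (not the authors' exact formulation) of a batch-size-oblivious normalization layer into which BN-like randomness is injected: per-sample statistics are computed as in instance normalization, and during training the normalized activations are perturbed by sampled scale and shift noise, mimicking the stochasticity that mini-batch statistics introduce in BN. The class name, the Gaussian noise form, and the `noise_std` value are assumptions for illustration only.

```python
import torch
import torch.nn as nn


class StochasticNorm(nn.Module):
    """Instance-norm-style layer with sampled perturbations of its statistics.

    A sketch of the general idea only: randomness is placed in the
    (otherwise deterministic) normalization parameters rather than in the data.
    """

    def __init__(self, num_channels, eps=1e-5, noise_std=0.1):
        super().__init__()
        self.eps = eps
        self.noise_std = noise_std  # strength of the injected stochasticity (assumed value)
        self.gamma = nn.Parameter(torch.ones(num_channels))
        self.beta = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x):
        # x: (N, C, H, W). Statistics are per sample and per channel,
        # so the layer is oblivious to the batch size.
        mean = x.mean(dim=(2, 3), keepdim=True)
        var = x.var(dim=(2, 3), unbiased=False, keepdim=True)
        x_hat = (x - mean) / torch.sqrt(var + self.eps)

        if self.training:
            # Inject multiplicative and additive noise into the effective
            # scale and shift, analogous to the randomness of batch statistics in BN.
            noise_scale = 1.0 + self.noise_std * torch.randn_like(mean)
            noise_shift = self.noise_std * torch.randn_like(mean)
            x_hat = x_hat * noise_scale + noise_shift

        return x_hat * self.gamma.view(1, -1, 1, 1) + self.beta.view(1, -1, 1, 1)


# Usage: drop-in replacement for a normalization layer in a conv net.
layer = StochasticNorm(num_channels=64)
out = layer(torch.randn(8, 64, 32, 32))
```

At test time the noise is switched off (the layer behaves deterministically); keeping it on and averaging several stochastic forward passes would correspond to sampling from the approximate posterior for uncertainty estimation.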

Citation (APA)

Shekhovtsov, A., & Flach, B. (2019). Stochastic Normalizations as Bayesian Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11362 LNCS, pp. 463–479). Springer Verlag. https://doi.org/10.1007/978-3-030-20890-5_30
