Large Ensemble Averaging

  • Horn D
  • Naftaly U
  • Intrator N

Abstract

Averaging over many predictors reduces the variance portion of the error. We present a method for evaluating the mean squared error of an infinite ensemble of predictors from finite (small-size) ensemble information. We demonstrate it on ensembles of networks with different initial choices of synaptic weights. We find that the optimal stopping criterion for large ensembles occurs later in training time than for single networks. We test our method on the sunspots data set and obtain excellent results.

6.1 Introduction

Ensemble averaging has been proposed in the literature as a means to improve the generalization properties of a neural network predictor [3, 11, 7]. We follow this line of thought and consider averaging over a set of networks that differ from one another only in the initial values of their synaptic weights. We introduce a method to extract the performance of large ensembles from that of finite-size ones. This is explained in the next section and is demonstrated on the sunspots data set. Ensemble averaging over the initial conditions of the neural networks leads to a lower prediction error, which is reached at a later training time than that expected from single networks. Our method outperforms the best published results for the sunspots problem [6].

The theoretical setting of the method is provided by the bias/variance decomposition. Within this framework, we define a particular bias/variance decomposition for networks differing by their initial conditions only. While the bias of the ensemble of networks with different initial conditions remains unchanged, the variance error decreases considerably.

6.2 Extrapolation to Large-Ensemble Averages

The training procedure of neural networks starts out with some choice of initial values of the connection weights. We consider ensembles of networks that differ from one another only in their initial values, and we average over them. Since the
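The variance-reduction argument above can be illustrated with a small numerical sketch. It is not the authors' procedure, which this excerpt does not give. It assumes that each ensemble member equals the target plus a shared bias plus independent Gaussian fluctuations (a stand-in for networks trained from different initial weights), so that the mean squared error of an average over Q members behaves like bias² + variance/Q. Under that assumption, fitting a straight line to the error as a function of 1/Q for small Q and reading off the intercept gives an estimate for the infinite ensemble. All names and values (`ensemble_mse`, `bias`, `sigma`, `qs`) are hypothetical.

```python
import numpy as np

def ensemble_mse(preds, target, q, n_groups, rng):
    """Average MSE of the mean prediction over random groups of q members."""
    errs = []
    for _ in range(n_groups):
        idx = rng.choice(preds.shape[0], size=q, replace=False)
        avg = preds[idx].mean(axis=0)           # ensemble-averaged prediction
        errs.append(np.mean((avg - target) ** 2))
    return float(np.mean(errs))

rng = np.random.default_rng(0)
n_models, n_points = 40, 500

# Synthetic stand-in for trained networks (assumption, not real training):
# every member shares the same bias and has its own independent noise.
target = np.sin(np.linspace(0.0, 6.0, n_points))
bias, sigma = 0.1, 0.5
preds = target + bias + sigma * rng.standard_normal((n_models, n_points))

# Measure the error of small ensembles only.
qs = np.array([1, 2, 4, 8])
mse = np.array([ensemble_mse(preds, target, q, 200, rng) for q in qs])

# Linear fit in 1/Q; the intercept is the extrapolation to 1/Q -> 0.
slope, intercept = np.polyfit(1.0 / qs, mse, 1)

for q, m in zip(qs, mse):
    print(f"Q={q}: MSE={m:.4f}")
print(f"extrapolated infinite-ensemble MSE: {intercept:.4f}")
print(f"estimated single-member variance:   {slope:.4f}")
```

In this toy setting the intercept approximates the squared bias, which averaging cannot remove, and the slope approximates the variance of a single member. With real networks the fluctuations need not be independent or Gaussian, so the linear form is an assumption to check, not a guarantee.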

Cite

Horn, D., Naftaly, U., & Intrator, N. (1998). Large Ensemble Averaging (pp. 133–139). https://doi.org/10.1007/3-540-49430-8_7
