Tikhonov training of the CMAC neural network.
- ISSN: 10459227
- DOI: 10.1109/TNN.2006.872348
- PubMed: 16722167
Abstract
The architecture of the cerebellar model articulation controller (CMAC) presents a rigid compromise between learning and generalization. In the presence of a sparse training dataset, this limitation manifestly causes overfitting, a drawback that is not overcome by current training algorithms. This paper proposes a novel training framework founded on Tikhonov regularization, which relates to the minimization of the power of the sigma-order derivative. This smoothness criterion yields an internal cell-interaction mechanism that increases the generalization beyond the degree hardcoded in the CMAC architecture while preserving the potential CMAC learning capabilities. The resulting training mechanism, which proves to be simple and computationally efficient, is deduced from a rigorous theoretical study. The performance of the new training framework is validated against comparative benchmarks from the DELVE environment.
Tikhonov Training of the CMAC Neural Network
Luis Weruaga, Associate Member, IEEE, and Barbara Kieslinger
Abstract—The architecture of the cerebellar model articula-
tion controller (CMAC) presents a rigid compromise between
learning and generalization. In the presence of a sparse training
dataset, this limitation manifestly causes overfitting, a drawback
that is not overcome by current training algorithms. This paper
proposes a novel training framework founded on Tikhonov
regularization, which relates to the minimization of the power
of the sigma-order derivative. This smoothness criterion yields an
internal cell-interaction mechanism that increases the general-
ization beyond the degree hardcoded in the CMAC architecture
while preserving the potential CMAC learning capabilities. The
resulting training mechanism, which proves to be simple and com-
putationally efficient, is deduced from a rigorous theoretical study.
The performance of the new training framework is validated
against comparative benchmarks from the DELVE environment.
Index Terms—Cerebellar model articulation controller
(CMAC), generalization, overfitting, Tikhonov regularization.
I. INTRODUCTION
THE cerebellar model articulation controller (CMAC) is a
nonlinear adaptive system proposed by Albus in the mid-1970s
[1] with built-in simple computation, local generalization
and fast learning properties [2]. In spite of its elegant features, it
has attracted modest interest because of its rigid structure, large
memory requirements in multidimensional problems, and its in-
capability to represent an arbitrary function with any degree of
accuracy. Furthermore, the attempts to describe the theoretical
identification properties [3], [4] have not provided conclusive
results, while the alternative training algorithms [5]–[7] are
essentially variants of Albus’ original rule.
Manuscript received March 26, 2004; revised June 1, 2005.
L. Weruaga is with the Commission for Scientific Visualization, Austrian
Academy of Sciences, A-1220 Vienna, Austria (e-mail: weruaga@ieee.org).
B. Kieslinger is with the Centre for Social Innovation, A-1150 Vienna,
Austria (e-mail: kieslinger@zsi.at).
Digital Object Identifier 10.1109/TNN.2006.872348
In most feedforward neural networks (NNs), the ability
to generalize, that is, to have the outputs of the net approximate
the target values for inputs that are not in the training set, lies
de facto in the structure: for instance, in the multilayer
perceptron (MLP) that ability depends on the number of hidden units
and weight initialization [8], in radial basis functions (RBF) on
the spatial spread [9], and in the CMAC the generalization
degree is hardcoded in the size of the hypercube. As sufficient
training data are not always available or are expensive to obtain,
the ability of the NN to interpolate/extrapolate from
a sparse training set is a major concern. Improving the
generalization capability of an NN has been addressed extensively
from an algorithmic point of view: the popular weight decay
[10] (which limits the power of the NN weights), the minimization
of the power of the derivatives of the neuron output in
sigmoidal networks [11], the addition of noise to the training set
[12] (which has important theoretical implications), the
definition of smoothness functionals that correspond to different
multilayer networks with one hidden layer [13], and the
minimization of the expected error or structural risk [14] are popular
strategies. Regarding the CMAC, the most popular training
algorithm, the least-mean-square (LMS) or Albus’ rule, spreads
the training error among the excited local functions. Especially
with a sparse training dataset, this approach does not
explicitly produce a consistent output. In that situation, an increase in
generalization can only be accomplished by sacrificing CMAC
learning capabilities.
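The error-spreading behavior of Albus’ rule mentioned above can be illustrated with a minimal one-dimensional sketch. The class name TinyCMAC, the quantization scheme, and all parameter values are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np


class TinyCMAC:
    """Minimal 1-D CMAC sketch (hypothetical helper, not the paper's code)."""

    def __init__(self, n_cells=32, width=4, x_min=0.0, x_max=1.0):
        self.width = width                  # generalization degree: cells excited per input
        self.x_min, self.x_max = x_min, x_max
        self.n_pos = n_cells - width + 1    # number of distinct quantized input positions
        self.w = np.zeros(n_cells)          # one weight per cell

    def active(self, x):
        """Indices of the `width` consecutive cells excited by input x."""
        u = (x - self.x_min) / (self.x_max - self.x_min)
        start = int(np.clip(u, 0.0, 1.0) * (self.n_pos - 1) + 0.5)
        return np.arange(start, start + self.width)

    def predict(self, x):
        """CMAC output: sum of the excited weights."""
        return self.w[self.active(x)].sum()

    def train_step(self, x, target, lr=1.0):
        """LMS (Albus) rule: share the output error equally among the excited cells."""
        idx = self.active(x)
        err = target - self.predict(x)
        self.w[idx] += lr * err / self.width
        return err


# Sparse training set: three samples on [0, 1] (assumed values).
net = TinyCMAC(n_cells=32, width=4)
for _ in range(10):
    for x, y in zip([0.1, 0.5, 0.9], [1.0, -1.0, 2.0]):
        net.train_step(x, y)

print(net.predict(0.1))   # a training input is reproduced
print(net.predict(0.3))   # an input whose cells were never excited keeps the initial output
```

In this sketch only the cells excited by a training sample are ever updated, so the output between distant samples stays at its initial value unless the width is enlarged, which is the rigid tradeoff the text describes.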
Few attempts have been documented to address the weak
performance of the CMAC with sparse training sets [15]–[17]. The
so-called optimal weight-smoothing, proposed in [15], pursues
the minimization of the squared difference between adjacent
CMAC weights. A reduced CMAC-based structure suitable
for high-dimensional problems is proposed in [16], where it is
claimed that in several realistic problems the new CMAC-based
system keeps similar learning capabilities with regard to
the global CMAC, whereas its generalization is intrinsically
improved. In [17] an interesting support vector (SV)-based
interpretation of the CMAC is deduced: The CMAC is used
to solve a ridge-regression problem [14], which results in a
problem complexity upper bounded by the size of the training
set (and not by the CMAC size). This proposal has parallels
with [7] since both mechanisms rely on the so-called
articulation matrix. Given that the generalization capabilities
of the “Kernel” CMAC are not improved per se, a weight
smoothing mechanism is proposed as a tentative solution [17],
but its performance on realistic problems and sparse training
sets has not been sufficiently addressed.
This paper proposes a novel CMAC training algorithm
that overcomes the aforementioned rigid learning-generalization
tradeoff. The improvement in generalization comes explicitly
from the minimization of the power of the sigma-order derivative
of the CMAC output, which corresponds to the well-known
Tikhonov regularization [18]. This smoothness criterion results
in the propagation of the external data to the whole CMAC
architecture by means of an iterative linear interaction among
the CMAC weights. The paper is structured as follows: The
premise and motivation are included in Section II; Section III
presents the formulation of the Tikhonov regularization in the
CMAC structure, which results in a simple, stable gradient-de-
scent training algorithm; Section IV deals with the practical
implementation of the novel training algorithm; Section V
contains the analysis of previous related works in comparison
to the proposed regularization; results over synthetic and real-
istic scenarios are included in Section VI; Section VII raises a
discussion about the perspectives and potential benefits of this
regularization framework; the conclusions close the paper.
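As a rough illustration of the smoothness idea introduced at the start of this paragraph, the sketch below fits a plain table of weights to a few samples while penalizing the power of the first-order difference between neighboring weights, using gradient descent. It is a generic Tikhonov-style example under stated assumptions, not the algorithm derived in Section III: the function name tikhonov_fit, the first-order penalty, the one-cell-per-sample lookup, the weight lam, and the step size are all hypothetical choices.

```python
import numpy as np


def tikhonov_fit(n_cells, idx, targets, lam=1.0, n_iter=20000):
    """Gradient descent on
        sum_k (w[idx_k] - y_k)^2  +  lam * sum_i (w[i+1] - w[i])^2
    i.e. a data-fit term plus a first-order smoothness penalty (assumed form).
    """
    idx = np.asarray(idx)
    y = np.asarray(targets, dtype=float)
    w = np.zeros(n_cells)
    # The Hessian eigenvalues are bounded by 2 * (1 + 4 * lam), so this step is stable.
    step = 1.0 / (2.0 * (1.0 + 4.0 * lam))
    for _ in range(n_iter):
        grad = np.zeros(n_cells)
        np.add.at(grad, idx, 2.0 * (w[idx] - y))   # data term, observed cells only
        d = np.diff(w)                             # d[i] = w[i+1] - w[i]
        grad[:-1] -= 2.0 * lam * d                 # smoothness term, left neighbor
        grad[1:] += 2.0 * lam * d                  # smoothness term, right neighbor
        w -= step * grad
    return w


# Three sparse samples placed at cells 3, 15, and 27 of a 32-cell table (assumed values).
w = tikhonov_fit(32, [3, 15, 27], [1.0, -1.0, 2.0], lam=1.0)
print(np.round(w, 3))
```

Each gradient step couples every weight to its neighbors, so the information carried by the few observed cells spreads over the whole table. Larger values of lam give a smoother result at the cost of a looser fit to the samples.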
1045-9227/$20.00 © 2006 IEEE