The hierarchical mixture of experts architecture provides a flexible procedure for implementing classification algorithms. Classification is obtained by a recursive, data-driven soft partition of the feature space. This procedure enables local classification: several experts are used, each assigned the task of classifying over some subspace of the feature space. In this work, we provide data-dependent generalization error bounds for this class of models, which lead to effective procedures for performing model selection. Tight bounds are particularly important here because the model is highly parameterized. The theoretical results are complemented by numerical experiments based on a randomized algorithm, which mitigates the effects of the local minima that plague other approaches such as the expectation-maximization algorithm.
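To make the recursive soft partition concrete, here is a minimal sketch of the forward pass of a two-level hierarchical mixture of experts for binary classification. It is an illustration under stated assumptions, not the construction analyzed in the paper: the softmax gates, the logistic experts, the depth of two, and all names (`hme_predict`, `top_gate`, `sub_gates`, `experts`) are hypothetical choices, and the parameters are random rather than trained.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hme_predict(X, top_gate, sub_gates, experts):
    """Two-level HME forward pass (illustrative assumption, not the paper's exact model).

    X:         (n, d) inputs
    top_gate:  (d, K) weights of the root gate over K branches
    sub_gates: (K, d, M) weights of each branch's gate over its M experts
    experts:   (K, M, d) weights of the logistic experts at the leaves
    """
    g_top = softmax(X @ top_gate)                              # (n, K)
    g_sub = softmax(np.einsum("nd,kdm->nkm", X, sub_gates))    # (n, K, M), normalized over M
    p_exp = sigmoid(np.einsum("nd,kmd->nkm", X, experts))      # (n, K, M) expert outputs
    # Soft partition: each leaf's weight is the product of the gate
    # values along its path from the root.
    weights = g_top[:, :, None] * g_sub                        # (n, K, M)
    prob = (weights * p_exp).sum(axis=(1, 2))                  # (n,) mixture probability
    return prob, weights

# Random (untrained) parameters, used only to exercise the forward pass.
rng = np.random.default_rng(0)
n, d, K, M = 5, 3, 2, 2
X = rng.normal(size=(n, d))
top_gate = rng.normal(size=(d, K))
sub_gates = rng.normal(size=(K, d, M))
experts = rng.normal(size=(K, M, d))
prob, weights = hme_predict(X, top_gate, sub_gates, experts)
```

For every input the leaf weights are non-negative and sum to one, so each expert contributes mainly in the region of feature space where its path of gates is active. This is the sense in which the partition is soft and the classification is local.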
CITATION STYLE
Azran, A., & Meir, R. (2004). Data dependent risk bounds for hierarchical mixture of experts classifiers. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3120, pp. 427–441). Springer Verlag. https://doi.org/10.1007/978-3-540-27819-1_30