In this paper we propose and investigate a novel nonlinear unit, called Lp unit, for deep neural networks. The proposed L p unit receives signals from several projections of a subset of units in the layer below and computes a normalized L p norm. We notice two interesting interpretations of the Lp unit. First, the proposed unit can be understood as a generalization of a number of conventional pooling operators such as average, root-mean-square and max pooling widely used in, for instance, convolutional neural networks (CNN), HMAX models and neocognitrons. Furthermore, the L p unit is, to a certain degree, similar to the recently proposed maxout unit [13] which achieved the state-of-the-art object recognition results on a number of benchmark datasets. Secondly, we provide a geometrical interpretation of the activation function based on which we argue that the L p unit is more efficient at representing complex, nonlinear separating boundaries. Each L p unit defines a superelliptic boundary, with its exact shape defined by the order p. We claim that this makes it possible to model arbitrarily shaped, curved boundaries more efficiently by combining a few L p units of different orders. This insight justifies the need for learning different orders for each unit in the model. We empirically evaluate the proposed L p units on a number of datasets and show that multilayer perceptrons (MLP) consisting of the L p units achieve the state-of-the-art results on a number of benchmark datasets. Furthermore, we evaluate the proposed L p unit on the recently proposed deep recurrent neural networks (RNN). © 2014 Springer-Verlag.
CITATION STYLE
Gulcehre, C., Cho, K., Pascanu, R., & Bengio, Y. (2014). Learned-norm pooling for deep feedforward and recurrent neural networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8724 LNAI, pp. 530–546). Springer Verlag. https://doi.org/10.1007/978-3-662-44848-9_34
Mendeley helps you to discover research relevant for your work.