A recent research interest in deep neural networks is to understand why deep networks are preferred to shallow ones. In this article, we consider an advantage of a deep structure in realizing a Heaviside function in training. This is significant not only for simple classification problems but also as a basis for constructing general non-smooth functions. A Heaviside function can be well approximated by a difference of ReLUs if extremely large weight values can be set; however, such values are not easy to attain in training. We show that a Heaviside function can be well represented without large weight values if we employ a deep structure. We also show that the update terms of the input-side weights are necessarily large when a network is trained to realize a Heaviside function; an apparent acceleration of training is therefore brought about by setting a small learning rate. As a result, we can say that, by employing a deep structure, a good fit to a Heaviside function can be obtained within a reasonable training time under a moderately small learning rate. Our results suggest that a deep structure is effective in practical training that requires a discontinuous output.
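The shallow-versus-deep contrast in the abstract can be sketched numerically: a single difference of ReLUs yields a ramp of width 1/a, so a sharp step needs a huge weight a, whereas composing the same moderate-weight ramp over several layers shrinks the transition width geometrically. The snippet below is a minimal illustration of this idea (the function names `ramp` and `deep_step`, and the particular values a = 4 and depth = 5, are our own choices, not taken from the paper):

```python
import numpy as np

def relu(x):
    # Elementwise rectified linear unit.
    return np.maximum(x, 0.0)

def ramp(x, a):
    # Difference of two ReLUs: a piecewise-linear approximation of the
    # Heaviside step, rising from 0 to 1 over the interval [0, 1/a].
    return relu(a * x) - relu(a * x - 1.0)

def deep_step(x, a=4.0, depth=5):
    # Composing the same moderate-weight ramp `depth` times narrows the
    # transition to width 1/a**depth, i.e. 1/1024 here, without any
    # single weight exceeding a = 4.
    y = x
    for _ in range(depth):
        y = ramp(y, a)
    return y

# A single-layer ramp would need a = 1024 to match this sharpness.
print(deep_step(np.array([-0.5, 0.001, 0.5])))  # close to the step [0, 1, 1]
```

The point matches the abstract: the composed network realizes a nearly discontinuous step while every individual weight stays moderate.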
Hagiwara, K. (2018). On a fitting of a Heaviside function by deep ReLU neural networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11301 LNCS, pp. 59–69). Springer Verlag. https://doi.org/10.1007/978-3-030-04167-0_6