Abstract
Gradient-based optimization algorithms are the standard methods for adapting the weights of neural networks. The natural gradient gives the steepest descent direction with respect to a non-Euclidean metric in weight space that is, from a theoretical point of view, more appropriate. While the natural gradient has already proven to be advantageous for online learning, we explore its benefits for batch learning: we empirically compare Rprop (resilient backpropagation), one of the best performing first-order learning algorithms, using the Euclidean and the non-Euclidean metric, respectively. As batch steepest descent on the natural gradient is closely related to Levenberg-Marquardt optimization, we add this method to our comparison.

It turns out that the Rprop algorithm can indeed profit from the natural gradient: the optimization speed, measured in terms of weight updates, can increase significantly compared to the original version. Rprop based on the non-Euclidean metric performs at least as well as Levenberg-Marquardt optimization on the two benchmark problems considered and appears to be slightly more robust. However, in both Levenberg-Marquardt optimization and Rprop using the natural gradient, computing a weight update requires cubic time and quadratic space. Further, both methods have additional hyperparameters that are difficult to adjust. In contrast, conventional Rprop has linear space and time complexity, and its hyperparameters need no difficult tuning.
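To make the contrast concrete, below is a minimal Python sketch of the two ingredients the abstract names: a per-weight, sign-based Rprop update (here the iRprop- variant; the abstract does not specify which Rprop variant the paper uses) and a natural-gradient direction obtained by solving against the Fisher matrix. The function names, the damping term, and the toy quadratic in the usage example are illustrative assumptions, not the authors' code; the O(n^3) time and O(n^2) space of the linear solve are what the abstract's complexity remarks refer to.

    import numpy as np

    def rprop_step(w, grad, prev_grad, delta,
                   eta_plus=1.2, eta_minus=0.5,
                   delta_min=1e-6, delta_max=50.0):
        # One iRprop- update: step sizes grow after consecutive gradient
        # components of equal sign, shrink after a sign flip. Only the
        # sign of the (natural) gradient enters the weight change.
        # Hypothetical sketch; default constants are the standard Rprop values.
        sign_change = grad * prev_grad
        delta = np.where(sign_change > 0,
                         np.minimum(delta * eta_plus, delta_max), delta)
        delta = np.where(sign_change < 0,
                         np.maximum(delta * eta_minus, delta_min), delta)
        grad = np.where(sign_change < 0, 0.0, grad)  # iRprop-: forget gradient after a sign flip
        w = w - np.sign(grad) * delta
        return w, grad, delta

    def natural_gradient_direction(grad, fisher, damping=1e-4):
        # Natural-gradient direction F^{-1} * grad. Solving the n-by-n
        # system costs cubic time and quadratic space, as stated above.
        # The damping term is an assumption added for numerical stability.
        n = fisher.shape[0]
        return np.linalg.solve(fisher + damping * np.eye(n), grad)

    # Toy usage on a quadratic error E(w) = 0.5 * w' A w (hypothetical example;
    # A stands in for the Fisher matrix here).
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    w = np.array([5.0, -4.0])
    prev_grad = np.zeros_like(w)
    delta = np.full_like(w, 0.1)
    for _ in range(100):
        g = A @ w                                 # gradient of the quadratic
        ng = natural_gradient_direction(g, A)     # non-Euclidean descent direction
        w, prev_grad, delta = rprop_step(w, ng, prev_grad, delta)

Feeding the natural-gradient direction (rather than the plain gradient) into the unchanged Rprop update is one plausible reading of "Rprop using the natural gradient"; since Rprop only uses gradient signs, the metric enters solely through the direction passed in.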
Citation
Igel, C., Toussaint, M., & Weishui, W. (2005). Rprop Using the Natural Gradient. In Trends and Applications in Constructive Approximation (pp. 259–272). Birkhäuser Basel. https://doi.org/10.1007/3-7643-7356-3_19