Layer-Wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs

Lei Huang; Jie Qin; Li Liu; Fan Zhu; Ling Shao

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 12347 LNCS, pp. 384–401 (2020)

DOI: 10.1007/978-3-030-58536-5_23

4 Citations

61 Readers

Abstract

Conditioning analysis uncovers the landscape of an optimization objective by exploring the spectrum of its curvature matrix. This has been well explored theoretically for linear models. We extend this analysis to deep neural networks (DNNs) in order to investigate their learning dynamics. To this end, we propose layer-wise conditioning analysis, which explores the optimization landscape with respect to each layer independently. Such an analysis is theoretically supported under mild assumptions that approximately hold in practice. Based on our analysis, we show that batch normalization (BN) can stabilize the training, but sometimes result in the false impression of a local minimum, which has detrimental effects on the learning. Besides, we experimentally observe that BN can improve the layer-wise conditioning of the optimization problem. Finally, we find that the last linear layer of a very deep residual network displays ill-conditioned behavior. We solve this problem by only adding one BN layer before the last linear layer, which achieves improved performance over the original and pre-activation residual networks.
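
As a rough illustration of conditioning analysis in the linear-model setting the abstract refers to, the sketch below computes the condition number of the curvature matrix of a least-squares objective. It is not the authors' method: the toy data, the function name `condition_number`, and the use of per-feature standardization as a simple stand-in for normalization are all assumptions made for this example.

```python
import numpy as np

def condition_number(X):
    """Condition number of the least-squares curvature matrix X^T X / n."""
    n = X.shape[0]
    # Hessian of 0.5 * mean((X w - y)^2) with respect to w (independent of y).
    H = X.T @ X / n
    # eigvalsh returns real eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(H)
    return eigvals[-1] / eigvals[0]

rng = np.random.default_rng(0)

# Hypothetical toy data: two features on very different scales.
X = rng.normal(size=(1000, 2)) * np.array([1.0, 100.0])

# Per-feature standardization (an illustrative stand-in, not batch normalization).
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

kappa_raw = condition_number(X)
kappa_std = condition_number(X_std)
print(f"raw inputs:          kappa = {kappa_raw:.1f}")
print(f"standardized inputs: kappa = {kappa_std:.3f}")
```

A large ratio between the largest and smallest eigenvalue indicates an ill-conditioned problem, on which gradient descent tends to progress slowly along low-curvature directions. In this toy case, rescaling the inputs brings the ratio close to 1.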

Cite

Huang, L., Qin, J., Liu, L., Zhu, F., & Shao, L. (2020). Layer-Wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12347 LNCS, pp. 384–401). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58536-5_23
