In this chapter we will first describe the basic HF approach, and then examine well-known performance-improving techniques, such as preconditioning, that we have found beneficial for neural network training, as well as others of a more heuristic nature that are harder to justify but work well in practice. We will also provide practical tips for creating efficient and bug-free implementations, and discuss various pitfalls that may arise when designing and using an HF-type approach in a particular application.

© Springer-Verlag Berlin Heidelberg 2012.
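To make the "basic HF approach" mentioned above concrete, here is a minimal sketch of one Hessian-free (truncated-Newton) step on a toy quadratic loss. The matrix-free Hessian-vector product via finite differences of gradients and all function names are illustrative assumptions, not the chapter's exact algorithm (which uses Gauss-Newton curvature products and many refinements).

```python
import numpy as np

def loss_grad(w):
    # Gradient of a toy quadratic loss f(w) = 0.5 * w^T A w - b^T w.
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 1.0])
    return A @ w - b

def hessian_vector(w, v, eps=1e-6):
    # Matrix-free Hessian-vector product, the key trick behind HF:
    # H v is approximated as (grad(w + eps*v) - grad(w)) / eps,
    # so the full Hessian is never formed.
    return (loss_grad(w + eps * v) - loss_grad(w)) / eps

def cg_solve(w, g, iters=50, tol=1e-10):
    # Conjugate gradient on H d = -g, using only Hessian-vector products.
    d = np.zeros_like(g)
    r = -g - hessian_vector(w, d)   # initial residual
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Hp = hessian_vector(w, p)
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d

w = np.array([0.0, 0.0])
step = cg_solve(w, loss_grad(w))    # approximate Newton step
w_new = w + step
```

Because the toy loss is quadratic, a single CG solve lands essentially at the minimizer; on real neural network losses HF repeats such (preconditioned, damped) inner solves at each outer iteration.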
Martens, J., & Sutskever, I. (2012). Training deep and recurrent networks with Hessian-free optimization. Lecture Notes in Computer Science, 7700, 479–535. https://doi.org/10.1007/978-3-642-35289-8_27