Large-scale machine learning with stochastic gradient descent

Léon Bottou

Conference Proceedings

Proceedings of COMPSTAT 2010 - 19th International Conference on Computational Statistics, Keynote, Invited and Contributed Papers (2010) 177-186

DOI: 10.1007/978-3-7908-2604-3_16

Abstract

During the last decade, data sizes have grown faster than the speed of processors. In this context, the capabilities of statistical machine learning methods are limited by computing time rather than sample size. A more precise analysis uncovers qualitatively different tradeoffs for small-scale and large-scale learning problems. The large-scale case involves the computational complexity of the underlying optimization algorithm in non-trivial ways. Unlikely optimization algorithms such as stochastic gradient descent show amazing performance for large-scale problems. In particular, second-order stochastic gradient and averaged stochastic gradient are asymptotically efficient after a single pass over the training set. © Springer-Verlag Berlin Heidelberg 2010.
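
As a rough illustration of the averaged stochastic gradient idea mentioned in the abstract, the following sketch runs a single pass of SGD on a synthetic least-squares problem and keeps a running average of the iterates. It is not taken from the paper: the function name `averaged_sgd`, the synthetic data, the squared loss, and the step-size schedule gamma0 / (1 + gamma0 * t)^0.75 are all assumptions chosen for the example.

```python
import numpy as np

def averaged_sgd(X, y, gamma0=0.1, power=0.75):
    """Single pass of SGD on squared loss (hypothetical helper).

    Returns the last iterate and the running average of all iterates.
    The decaying step-size schedule is an assumption for this sketch.
    """
    n, d = X.shape
    w = np.zeros(d)
    w_avg = np.zeros(d)
    for t in range(n):
        x_t, y_t = X[t], y[t]
        # Gradient of 0.5 * (x_t . w - y_t)^2 with respect to w.
        grad = (x_t @ w - y_t) * x_t
        gamma_t = gamma0 / (1.0 + gamma0 * t) ** power
        w = w - gamma_t * grad
        # Incremental mean of the iterates seen so far.
        w_avg += (w - w_avg) / (t + 1)
    return w, w_avg

# Synthetic data (assumed for illustration): linear model plus small noise.
rng = np.random.default_rng(0)
n, d = 20000, 5
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

w_last, w_avg = averaged_sgd(X, y)
print("last-iterate error:", np.linalg.norm(w_last - w_true))
print("averaged error:    ", np.linalg.norm(w_avg - w_true))
```

Each example is visited exactly once, so the cost is one pass over the data. How the two error figures compare depends on the noise level, the step-size constants, and the random seed.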

Citation (APA)

Bottou, L. (2010). Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT 2010 - 19th International Conference on Computational Statistics, Keynote, Invited and Contributed Papers (pp. 177–186). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-7908-2604-3_16
