Recent Advances in Stochastic Gradient Descent in Deep Learning


Abstract

In the age of artificial intelligence, how best to handle huge amounts of data is a highly motivating and difficult problem. Among machine learning methods, stochastic gradient descent (SGD) is not only simple but also very effective. This study provides a detailed analysis of contemporary state-of-the-art deep learning applications, such as natural language processing (NLP), visual data processing, and voice and audio processing. It then introduces SGD and several of its variants that are already available as PyTorch optimizers, including SGD, Adagrad, Adadelta, RMSprop, Adam, AdamW, and so on. Finally, we propose theoretical conditions under which these methods are applicable and find that there is still a gap between the theoretical conditions under which the algorithms converge and practical applications; how to bridge this gap is a question for the future.
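As a rough illustration of how two of the optimizers named in the abstract differ, the sketch below fits a single weight to noiseless toy data with plain SGD and with Adam, using only the Python standard library. It is not taken from the article: the function names `sgd_fit` and `adam_fit`, the toy data, and all hyperparameter values are assumptions chosen for illustration, and the Adam update follows the commonly stated form with bias correction rather than any particular library's implementation.

```python
import random


def sgd_fit(xs, ys, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w * x with plain SGD, one sample per update (hypothetical helper)."""
    rng = random.Random(seed)
    w = 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            grad = 2.0 * (w * xs[i] - ys[i]) * xs[i]  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w


def adam_fit(xs, ys, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8,
             epochs=200, seed=0):
    """Fit y ~ w * x with Adam-style updates (hypothetical helper)."""
    rng = random.Random(seed)
    w, m, v, t = 0.0, 0.0, 0.0, 0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            grad = 2.0 * (w * xs[i] - ys[i]) * xs[i]
            m = beta1 * m + (1.0 - beta1) * grad          # running mean of gradients
            v = beta2 * v + (1.0 - beta2) * grad * grad   # running mean of squared gradients
            m_hat = m / (1.0 - beta1 ** t)                # bias-corrected estimates
            v_hat = v / (1.0 - beta2 ** t)
            w -= lr * m_hat / (v_hat ** 0.5 + eps)
    return w


# Assumed toy data: y = 3 * x with no noise, so the best weight is 3.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [3.0 * x for x in xs]

w_sgd = sgd_fit(xs, ys)
w_adam = adam_fit(xs, ys)
print(f"SGD estimate:  {w_sgd:.4f}")
print(f"Adam estimate: {w_adam:.4f}")
```

Both routines see the same per-sample gradient; the difference is that Adam rescales each step by running estimates of the gradient's first and second moments, whereas plain SGD applies the raw gradient scaled by a fixed learning rate.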

Cite

Tian, Y., Zhang, Y., & Zhang, H. (2023, February 1). Recent Advances in Stochastic Gradient Descent in Deep Learning. Mathematics. MDPI. https://doi.org/10.3390/math11030682
