Abstract
We propose a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametric maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, shedding new light on their similarities and differences. Our approach to gradient-based learning also generalises beyond the familiar continuous domains (modelled in categories of smooth maps) and can be realised in the discrete setting of Boolean circuits. Finally, we demonstrate the practical significance of our framework with an implementation in Python.
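To make the lens idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a lens can be modelled as a pair of Python functions: a forward map, and a backward map that takes an input and a change in the output and returns a change in the input. The names `Lens`, `then`, `square`, `triple`, and `gd_step` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Lens:
    """Hypothetical lens: a forward map and a backward (reverse derivative) map."""
    forward: Callable[[float], float]
    backward: Callable[[float, float], float]  # (input, output change) -> input change

    def then(self, other: "Lens") -> "Lens":
        """Sequential composition: forward maps run left to right,
        backward maps run right to left, which is the chain rule."""
        def fwd(x: float) -> float:
            return other.forward(self.forward(x))

        def bwd(x: float, dz: float) -> float:
            dy = other.backward(self.forward(x), dz)
            return self.backward(x, dy)

        return Lens(fwd, bwd)


# Two smooth maps paired with their reverse derivatives.
square = Lens(lambda x: x * x, lambda x, dy: 2.0 * x * dy)
triple = Lens(lambda y: 3.0 * y, lambda y, dz: 3.0 * dz)

# The composite x -> 3 * x**2, whose reverse derivative is built by composition.
model = square.then(triple)


def gd_step(lens: Lens, x: float, lr: float) -> float:
    """One plain gradient descent step on the lens output (assumed update rule)."""
    return x - lr * lens.backward(x, 1.0)
```

Here `model.backward(x, 1.0)` evaluates to `6 * x`, the derivative of `3 * x**2`, without that derivative ever being written by hand. This sketch covers only scalar maps with plain gradient descent; parametric maps, other optimisers, and the Boolean circuit setting are outside its scope.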
Cite
Cruttwell, G. S. H., Gavranović, B., Ghani, N., Wilson, P., & Zanasi, F. (2022). Categorical Foundations of Gradient-Based Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13240 LNCS, pp. 1–28). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-99336-8_1