We study the performance of a family of randomized parallel coordinate descent methods for minimizing a nonsmooth, nonseparable convex function. The problem class includes, as special cases, L1-regularized L1 regression and the minimization of the exponential loss (the “AdaBoost problem”). We assume that the input data defining the loss function are contained in a sparse $$m\times n$$ matrix $$A$$ with at most $$\omega$$ nonzeros in each row, and that the objective function has a “max structure” that allows us to smooth it. Our main contribution is the identification of closed-form expressions for parameters that guarantee a parallelization speedup depending on basic quantities of the problem, such as its size and the number of processors. The theory relies on a careful analysis of the Lipschitz constant of the smoothed objective restricted to low-dimensional subspaces, and shows that the speedup increases for sparser problems.
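To make the smoothing step concrete: in the style of Nesterov’s smoothing technique, a “max structure” means the nonsmooth loss admits a representation $$f(x)=\max_{z\in Q}\{\langle Ax,z\rangle - g(z)\}$$ over a compact set $$Q$$ with $$g$$ convex; this is a generic sketch, and the paper’s precise formulation may differ in details. Subtracting a strongly convex prox-term $$\mu d(z)$$ with smoothing parameter $$\mu>0$$ yields the smooth surrogate $$f_\mu(x)=\max_{z\in Q}\{\langle Ax,z\rangle - g(z)-\mu d(z)\}$$, which approximates $$f$$ uniformly to within $$\mu\max_{z\in Q}d(z)$$ and has a Lipschitz-continuous gradient, making it amenable to (parallel) coordinate descent.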
Fercoq, O., & Richtárik, P. (2019). Smooth Minimization of Nonsmooth Functions with Parallel Coordinate Descent Methods. In Springer Proceedings in Mathematics and Statistics (Vol. 279, pp. 57–96). Springer New York LLC. https://doi.org/10.1007/978-3-030-12119-8_4