Abstract
Incremental model-update strategies are widely used in machine learning and data mining. By "incremental update" we refer to models that are updated many times, each time using a small subset of the training data. Two well-known examples are stochastic gradient descent and MCMC. Both provide fast sequential performance and have generated many of the best-performing methods for particular problems (logistic regression, SVM, LDA, etc.). But these methods are difficult to adapt to parallel or cluster settings because of the overhead of distributing model updates through the network. Updates can be batched locally to reduce communication overhead, but convergence typically suffers as the batch size increases. In this paper we introduce and analyze butterfly mixing, an approach that interleaves communication with computation. We evaluate butterfly mixing on stochastic gradient algorithms for logistic regression and SVM, on two datasets. Results show that butterfly mix steps are fast and failure-tolerant, and overall we achieved a 3.3x speed-up over full mixing (AllReduce) on an Amazon EC2 cluster.
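To illustrate the mixing pattern the abstract refers to: in a butterfly network of 2^k machines, each mix step pairs every machine with the partner whose index differs in one bit, and the two average their models; cycling through all k bit positions fully mixes every model, like an AllReduce, but one cheap exchange can be interleaved after each gradient update instead of a full synchronization. The sketch below is a single-process simulation under these assumptions; the function names and averaging schedule are illustrative, not the paper's code.

```python
import numpy as np

def simulate_butterfly_mixing(models, n_rounds):
    """Simulate butterfly mixing on len(models) == 2**k virtual machines.

    Each round, machine i averages its model with partner i XOR (1 << t),
    where t cycles through the k bit positions. After k consecutive
    rounds, every machine holds the global average of the start models.
    """
    n = len(models)
    k = n.bit_length() - 1          # number of mix steps in one full pass
    assert n == 1 << k, "butterfly mixing assumes a power-of-two cluster"
    models = [m.copy() for m in models]
    for r in range(n_rounds):
        t = r % k                   # which bit to flip this round
        mixed = [None] * n
        for i in range(n):
            p = i ^ (1 << t)        # butterfly partner of machine i
            mixed[i] = 0.5 * (models[i] + models[p])
        models = mixed
    return models

# After k = 2 rounds on 4 machines, all models equal the global mean.
start = [np.array([float(i)]) for i in range(4)]   # models 0, 1, 2, 3
final = simulate_butterfly_mixing(start, 2)
```

In a real deployment each averaging step would be a network exchange between the two partners, so a mix step costs one message per machine rather than the log-depth tree of a full AllReduce.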
Citation
Zhao, H., & Canny, J. (2013). Butterfly mixing: Accelerating incremental-update algorithms on clusters. In Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013 (pp. 785–793). SIAM. https://doi.org/10.1137/1.9781611972832.87