Evaluating parallel minibatch training for machine learning applications

Abstract

The amount of data available for analytics applications continues to rise. At the same time, there are application areas where security and privacy concerns prevent liberal dissemination of data. Both of these factors motivate the hypothesis that machine learning algorithms may benefit from parallelizing the training process (for large amounts of data) and/or distributing the training process (for sensitive data that cannot be shared). We investigate this hypothesis by considering two real-world machine learning tasks (logistic regression and sparse autoencoder), and empirically test how a model's performance changes when its parameters are set to the arithmetic means of the parameters of models trained on minibatches, i.e., horizontally split portions of the data set. We observe that iterating the minibatch training and parameter averaging process a small number of times results in models whose performance is only slightly worse than that of models trained on the full data sets.
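For concreteness, the training-and-averaging loop the abstract describes can be sketched as: split the data horizontally, train one model per split from a common starting point, take the arithmetic mean of the resulting parameters, and repeat. The Python sketch below illustrates this for the logistic regression case using plain NumPy; the local training routine, learning rate, and the numbers of splits and rounds are illustrative assumptions, not the authors' exact experimental setup.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_local(X, y, w, lr=0.1, epochs=50):
    """Gradient-descent logistic regression starting from shared weights w."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)  # mean log-loss gradient
        w -= lr * grad
    return w

def parallel_minibatch_train(X, y, n_splits=4, n_rounds=5):
    """Iterate: train one model per horizontal split, then average parameters."""
    w = np.zeros(X.shape[1])
    X_parts = np.array_split(X, n_splits)
    y_parts = np.array_split(y, n_splits)
    for _ in range(n_rounds):
        local_ws = [train_local(Xi, yi, w) for Xi, yi in zip(X_parts, y_parts)]
        w = np.mean(local_ws, axis=0)  # arithmetic mean of local parameters
    return w

# Example usage on synthetic data (illustrative only):
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X @ rng.normal(size=5) > 0).astype(float)
w_avg = parallel_minibatch_train(X, y)

In the distributed setting the abstract motivates, each split would reside with a separate data owner and only the parameter vectors would be exchanged between rounds, so no raw data ever leaves its owner.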

Citation (APA)

Dreiseitl, S. (2018). Evaluating parallel minibatch training for machine learning applications. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10671 LNCS, pp. 400–407). Springer Verlag. https://doi.org/10.1007/978-3-319-74718-7_48
