Subsampling the concurrent AdaBoost algorithm: An efficient approach for large datasets

Abstract

In this work we propose a subsampled version of the Concurrent AdaBoost algorithm to handle large datasets efficiently. The proposal is based on a concurrent computing approach focused on improving the estimation of the distribution weights in the algorithm, thereby obtaining better generalization. On each round, we train several weak hypotheses in parallel, and use a weighted ensemble of them to update the distribution weights for the following boosting rounds. Instead of creating resamples of the same size as the original dataset, we subsample the data to speed up the training phase. We validate our proposal with different resampling sizes on three datasets, obtaining promising results and showing that the size of the resamples does not considerably affect the performance of the algorithm, while the execution time improves greatly.
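To make the procedure concrete, below is a minimal Python sketch of the idea as the abstract describes it: on each round, several weak hypotheses are trained in parallel on distribution-weighted subsamples, combined into a weighted ensemble, and the ensemble's output drives the usual AdaBoost weight update. The function names, the accuracy-based combination weights, and parameters such as subsample_rate and n_workers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from sklearn.tree import DecisionTreeClassifier

def subsampled_concurrent_adaboost(X, y, n_rounds=10, n_workers=4,
                                   subsample_rate=0.3, seed=0):
    """Boost with n_workers weak hypotheses trained concurrently per round,
    each on a subsample of size subsample_rate * n drawn from the current
    distribution. Labels y are expected in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n = len(X)
    D = np.full(n, 1.0 / n)                    # distribution weights
    ensemble = []
    for _ in range(n_rounds):
        # Draw one weighted subsample per worker; each has only
        # subsample_rate * n examples, which is where the speed-up
        # over full-size resamples comes from.
        m = max(1, int(subsample_rate * n))
        samples = [rng.choice(n, size=m, replace=True, p=D)
                   for _ in range(n_workers)]
        # Train the weak hypotheses in parallel.
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            weak = list(pool.map(
                lambda idx: DecisionTreeClassifier(max_depth=1)
                            .fit(X[idx], y[idx]),
                samples))
        # Combine the weak hypotheses into a weighted ensemble; here they
        # are weighted by D-weighted training accuracy (an assumption,
        # as the abstract does not specify the combination rule).
        preds = np.array([h.predict(X) for h in weak])
        accs = np.array([np.sum(D * (p == y)) for p in preds])
        w = accs / accs.sum()
        h_t = np.where(w @ preds >= 0, 1, -1)  # round hypothesis
        eps = max(np.sum(D * (h_t != y)), 1e-12)
        if eps >= 0.5:
            break                               # no better than chance
        alpha = 0.5 * np.log((1 - eps) / eps)
        ensemble.append((alpha, w, weak))
        # Standard AdaBoost update, driven by the ensemble's output.
        D *= np.exp(-alpha * y * h_t)
        D /= D.sum()
    return ensemble

def predict(ensemble, X):
    """Sign of the alpha-weighted vote over all boosting rounds."""
    score = np.zeros(len(X))
    for alpha, w, weak in ensemble:
        preds = np.array([h.predict(X) for h in weak])
        score += alpha * np.where(w @ preds >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)
```

In this sketch, shrinking subsample_rate shortens each round roughly in proportion to the subsample size, which matches the abstract's observation that smaller resamples mainly buy execution time rather than costing accuracy.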

Cite


APA

Allende-Cid, H., Acuña, D., & Allende, H. (2017). Subsampling the concurrent AdaBoost algorithm: An efficient approach for large datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10125 LNCS, pp. 318–325). Springer Verlag. https://doi.org/10.1007/978-3-319-52277-7_39
