Stochastic gradient-boosted decision trees are widely employed for multivariate classification and regression tasks. This paper presents a speed-optimized and cache-friendly implementation for multivariate classification called FastBDT. The concepts used to optimize the execution time are discussed in detail in this paper. The key ideas include: an equal-frequency binning on the input data, which allows replacing expensive floating-point with integer operations, while at the same time increasing the quality of the classification; a cache-friendly linear access pattern to the input data, in contrast to usual implementations, which exhibit a random access pattern. FastBDT provides interfaces to C/C++, Python and TMVA. It is extensively used in the field of high energy physics (HEP) by the Belle II experiment.
CITATION STYLE
Keck, T. (2017). FastBDT: A Speed-Optimized Multivariate Classification Algorithm for the Belle II Experiment. Computing and Software for Big Science, 1(1). https://doi.org/10.1007/s41781-017-0002-8
Mendeley helps you to discover research relevant for your work.