Bitpaths: Compressing Datasets Without Decreasing Predictive Performance

Loren Nuyts; Laurens Devos; Wannes Meert; Jesse Davis

Conference Proceedings

Communications in Computer and Information Science (2023), vol. 1752 CCIS, pp. 261-268

DOI: 10.1007/978-3-031-23618-1_18

Abstract

The ever-growing amount of available data requires ever more memory to store it. Machine-learned models are becoming increasingly sophisticated and efficient in order to cope with this growth. However, not all data is relevant to a given machine learning task, and storing irrelevant data wastes memory and power. To address this, we propose bitpaths: a novel pattern-based method that compresses datasets using a random forest. During inference, a KNN classifier uses the encoded training examples to make a prediction for the encoded test example. We empirically compare the predictive performance of bitpaths with that of the uncompressed setting. Our method achieves compression ratios of up to 80 on datasets with a large number of features without affecting predictive performance.
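The abstract does not spell out the encoding, so the following is only a minimal sketch of the general idea (binary codes in place of raw features, then nearest-neighbour classification on the codes), not the authors' algorithm. The `SPLITS` list, the function names, the Hamming distance, and the 32-bit-per-feature storage cost are all illustrative assumptions; in a forest-based scheme the threshold tests would presumably come from the split nodes of trained trees rather than being hand-picked.

```python
from collections import Counter

# ASSUMPTION: hand-picked (feature_index, threshold) tests that stand in for
# split conditions a trained random forest might provide.
SPLITS = [(0, 0.5), (1, 2.0), (2, -1.0), (0, 1.5)]


def encode(example, splits=SPLITS):
    """Pack one bit per test (1 if value <= threshold) into a single int."""
    code = 0
    for feature, threshold in splits:
        code = (code << 1) | int(example[feature] <= threshold)
    return code


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def knn_predict(train_codes, train_labels, query_code, k=3):
    """Majority vote among the k training codes closest in Hamming distance."""
    order = sorted(range(len(train_codes)),
                   key=lambda i: hamming(train_codes[i], query_code))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def compression_ratio(n_features, n_bits, bits_per_feature=32):
    """Raw size over encoded size, ASSUMING fixed-width raw features."""
    return (n_features * bits_per_feature) / n_bits


# Toy data (invented for illustration only).
X_train = [[0.1, 1.0, -2.0], [0.2, 1.5, -1.5], [2.0, 3.0, 0.0], [1.8, 2.5, 0.5]]
y_train = ["a", "a", "b", "b"]

train_codes = [encode(x) for x in X_train]   # only these codes need storing
query_code = encode([0.3, 1.2, -1.2])
prediction = knn_predict(train_codes, y_train, query_code, k=3)
ratio = compression_ratio(n_features=3, n_bits=len(SPLITS))
```

In this sketch only `train_codes` and the labels are kept after encoding, which is where the memory saving would come from. The ratio grows with the number of raw features per stored bit, which is consistent with the abstract's remark that the largest gains appear on datasets with many features.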

Cite

Nuyts, L., Devos, L., Meert, W., & Davis, J. (2023). Bitpaths: Compressing Datasets Without Decreasing Predictive Performance. In Communications in Computer and Information Science (Vol. 1752 CCIS, pp. 261–268). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-23618-1_18
