Bitpaths: Compressing Datasets Without Decreasing Predictive Performance

Abstract

The ever-growing amount of available data requires ever more memory to store it, and machine-learned models are becoming increasingly sophisticated and efficient at navigating it. However, not all data is relevant to a given machine learning task, and storing irrelevant data wastes memory and power. To address this, we propose bitpaths: a novel pattern-based method that compresses a dataset using a random forest. During inference, a KNN classifier uses the encoded training examples to make a prediction for the encoded test example. We empirically compare the predictive performance of bitpaths with the uncompressed setting. Our method achieves compression ratios of up to 80 on datasets with a large number of features without affecting predictive performance.
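To make the idea concrete, the sketch below illustrates one plausible realization of a forest-based encoding followed by KNN inference. It is not the authors' exact bitpaths method: the dataset, forest size, one-hot leaf encoding, and Hamming-distance KNN are illustrative assumptions.

```python
# Sketch of a bitpath-style pipeline (illustrative, not the paper's method):
# encode each example by the leaves it reaches in a random forest, then
# classify encoded test examples with a KNN classifier over the bit vectors.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import OneHotEncoder

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit a small random forest; each tree partitions the feature space.
forest = RandomForestClassifier(n_estimators=32, max_depth=6, random_state=0)
forest.fit(X_train, y_train)

# apply() returns, for every example, the index of the leaf it falls into
# in each tree; one-hot encoding those indices yields a binary code per example.
encoder = OneHotEncoder(handle_unknown="ignore")
train_bits = encoder.fit_transform(forest.apply(X_train)).toarray()
test_bits = encoder.transform(forest.apply(X_test)).toarray()

# KNN over the encoded examples stands in for inference on the compressed data.
knn = KNeighborsClassifier(n_neighbors=5, metric="hamming")
knn.fit(train_bits, y_train)
print("accuracy on encoded examples:", knn.score(test_bits, y_test))
```

In this sketch the memory saving comes from storing only the per-tree leaf codes rather than the original feature vectors; the actual compression scheme and encoding details are described in the paper.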

Citation (APA)

Nuyts, L., Devos, L., Meert, W., & Davis, J. (2023). Bitpaths: Compressing Datasets Without Decreasing Predictive Performance. In Communications in Computer and Information Science (Vol. 1752 CCIS, pp. 261–268). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-23618-1_18
