The Case for Learned In-Memory Joins

Ibrahim Sabek; Tim Kraska

Conference Proceedings (Open Access)

Proceedings of the VLDB Endowment (2023) 16(7) 1749-1762

DOI: 10.14778/3587136.3587148

Abstract

In-memory join is an essential operator in any database engine and has been extensively investigated in the database literature. In this paper, we study whether exploiting CDF-based learned models to boost join performance is practical. To the best of our knowledge, we are the first to fill this gap. We investigate the use of CDF-based models and learned indexes (e.g., Recursive Model Index (RMI) and RadixSpline) in the three join categories: indexed nested loop join (INLJ), sort-based joins (SJ), and hash-based joins (HJ). Our study shows that there is room to improve the performance of all three join categories through our proposed optimized learned variants. Our experimental analysis shows that these optimized learned variants outperform state-of-the-art techniques in many scenarios and on different datasets.
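
To make the idea of a CDF-based model inside an indexed nested loop join concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a single linear model fitted over the sorted inner keys (far simpler than an RMI or RadixSpline), records the model's maximum position error, and corrects each prediction with a binary search bounded by that error. The names `fit_linear_cdf` and `learned_inlj` are hypothetical.

```python
from bisect import bisect_left


def fit_linear_cdf(sorted_keys):
    """Fit one linear key -> position model over a non-empty sorted list
    and measure its worst-case position error (an assumed, simplified model)."""
    n = len(sorted_keys)
    lo, hi = sorted_keys[0], sorted_keys[-1]
    slope = (n - 1) / (hi - lo) if hi != lo else 0.0

    def predict(key):
        # Scale the key into [0, n - 1] and clamp out-of-range predictions.
        pos = int(slope * (key - lo))
        return min(max(pos, 0), n - 1)

    # The largest gap between predicted and true position bounds the search window.
    max_err = max(abs(predict(k) - i) for i, k in enumerate(sorted_keys))
    return predict, max_err


def learned_inlj(outer_keys, inner_sorted_keys):
    """Join outer keys against sorted inner keys.

    Returns (outer_index, inner_index) pairs for every equal-key match.
    """
    if not inner_sorted_keys:
        return []
    predict, err = fit_linear_cdf(inner_sorted_keys)
    n = len(inner_sorted_keys)
    matches = []
    for outer_idx, key in enumerate(outer_keys):
        p = predict(key)
        # Binary search only inside the error window around the prediction.
        i = bisect_left(inner_sorted_keys, key, max(0, p - err), min(n, p + err + 1))
        # Scan forward to emit all duplicates of the key.
        while i < n and inner_sorted_keys[i] == key:
            matches.append((outer_idx, i))
            i += 1
    return matches


inner = [1, 3, 3, 7, 20, 21, 50]
outer = [3, 50, 4, 1, 100]
result = learned_inlj(outer, inner)
```

The potential benefit in this sketch comes from the search window: when the model tracks the key distribution closely, the error bound is small and each probe touches only a few positions instead of searching the whole array.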

Cite (APA)

Sabek, I., & Kraska, T. (2023). The Case for Learned In-Memory Joins. In Proceedings of the VLDB Endowment (Vol. 16, pp. 1749–1762). VLDB Endowment. https://doi.org/10.14778/3587136.3587148