Stream join is a fundamental operation in many real-world applications. Owing to the complexity of the join operation and the inherent characteristics of streaming data (e.g., skewed distributions and dynamics), adaptivity and load balancing remain pressing problems despite extensive research. This paper presents AdaptMX, an enhanced adaptive join-matrix system for stream theta-joins that combines the key-based and tuple-based join approaches: (i) at the outer level, it modifies the well-known join-matrix model to allocate resources on demand, improving the adaptivity of the tuple-based partitioning scheme; (ii) at the inner level, it adopts a key-based routing policy among grouped processing tasks to maintain join semantics, together with cost-effective load-balancing strategies to remove stragglers. For the demonstration, we present a transparent view of distributed stream theta-join processing and compare the performance of AdaptMX with other baselines, over which it achieves 3× higher throughput.
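For readers unfamiliar with the join-matrix model mentioned above, the following is a minimal single-process sketch of the generic, tuple-based scheme only. It is not AdaptMX's implementation and omits the on-demand resource allocation, key-based inner routing, and load balancing the abstract describes. The class name `JoinMatrix`, its methods, and the helper `run_demo` are hypothetical names chosen for illustration. The idea shown: each tuple of stream R is sent to one randomly chosen row and replicated across every column, each tuple of stream S is sent to one randomly chosen column and replicated across every row, so every (r, s) pair meets in exactly one cell, where an arbitrary theta predicate can be evaluated.

```python
import random
from collections import defaultdict


class JoinMatrix:
    """Illustrative join-matrix: a rows x cols grid of cells, each holding
    a partition of stream R and a partition of stream S."""

    def __init__(self, rows, cols, theta, seed=0):
        self.rows = rows
        self.cols = cols
        self.theta = theta                 # arbitrary join predicate theta(r, s)
        self.rng = random.Random(seed)     # seeded only to make the demo repeatable
        self.r_store = defaultdict(list)   # cell (i, j) -> stored R tuples
        self.s_store = defaultdict(list)   # cell (i, j) -> stored S tuples

    def insert_r(self, r):
        """Route an R tuple to one random row, replicated over all columns."""
        i = self.rng.randrange(self.rows)
        results = []
        for j in range(self.cols):
            cell = (i, j)
            # Probe the S tuples already stored in this cell, then store r.
            results.extend((r, s) for s in self.s_store[cell] if self.theta(r, s))
            self.r_store[cell].append(r)
        return results

    def insert_s(self, s):
        """Route an S tuple to one random column, replicated over all rows."""
        j = self.rng.randrange(self.cols)
        results = []
        for i in range(self.rows):
            cell = (i, j)
            # Probe the R tuples already stored in this cell, then store s.
            results.extend((r, s) for r in self.r_store[cell] if self.theta(r, s))
            self.s_store[cell].append(s)
        return results


def run_demo():
    """Join R and S on the inequality predicate r < s over a 2 x 3 matrix."""
    matrix = JoinMatrix(rows=2, cols=3, theta=lambda r, s: r < s)
    output = []
    # Interleave arrivals from the two streams.
    output += matrix.insert_r(1)
    output += matrix.insert_s(3)
    output += matrix.insert_r(5)
    output += matrix.insert_s(7)
    output += matrix.insert_r(9)
    return sorted(output)
```

Because a row and a column intersect in exactly one cell, the result is independent of the random placement and contains no duplicates. The cost is replication: each R tuple is stored `cols` times and each S tuple `rows` times, which is why the matrix shape matters for resource usage.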
CITATION STYLE
Wang, X., Jiang, C., Fang, J., Wang, X., & Zhang, R. (2018). AdaptMX: Flexible join-matrix streaming system for distributed theta-joins. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10828 LNCS, pp. 802–806). Springer Verlag. https://doi.org/10.1007/978-3-319-91458-9_52