Similarity join over array data

Abstract

Scientific applications are generating an ever-increasing volume of multi-dimensional data that are largely processed inside distributed array databases and frameworks. Similarity join is a fundamental operation across scientific workloads that requires complex processing over an unbounded number of pairs of multi-dimensional points. In this paper, we introduce a novel distributed similarity join operator for multi-dimensional arrays. Unlike immediate extensions to array join and relational similarity join, the proposed operator minimizes the overall data transfer and network congestion while providing load balancing, without completely repartitioning and replicating the input arrays. We formally define array similarity join and present the design, optimization strategies, and evaluation of the first array similarity join operator.
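To make the operation concrete, the following is a minimal single-node sketch of a similarity join between two small sparse arrays. It is an illustration only and not the operator proposed in the paper: the Euclidean metric, the threshold `eps`, the dictionary representation of an array, and the name `similarity_join` are all assumptions, and the sketch makes no attempt at the distributed data-transfer and load-balancing optimizations the abstract describes.

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def similarity_join(a, b, eps):
    """Return every coordinate pair (p, q), p in a and q in b, with dist(p, q) <= eps.

    `a` and `b` are hypothetical stand-ins for sparse arrays: dictionaries
    mapping integer coordinate tuples to cell values. This is the naive
    all-pairs formulation, so it compares len(a) * len(b) pairs.
    """
    return sorted((p, q) for p, q in product(a, b) if dist(p, q) <= eps)


# Two toy 2-D arrays (coordinates and values are made up for illustration).
alpha = {(0, 0): 1.5, (2, 3): 0.7, (5, 5): 2.1}
beta = {(0, 1): 9.0, (2, 2): 4.2, (9, 9): 3.3}

pairs = similarity_join(alpha, beta, eps=1.0)
for p, q in pairs:
    print(p, alpha[p], q, beta[q])
```

The all-pairs loop shows why the number of candidate pairs grows quickly with array size, which is the cost a distributed operator has to keep under control.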

Cite

Zhao, W., Rusu, F., Dong, B., & Wu, K. (2016). Similarity join over array data. In Proceedings of the ACM SIGMOD International Conference on Management of Data (Vol. 26-June-2016, pp. 2007–2022). Association for Computing Machinery. https://doi.org/10.1145/2882903.2915247
