Abstract
Datacube queries compute aggregates over database relations at a variety of granularities. Often one wants only the datacube output tuples whose aggregate value satisfies a certain condition, such as exceeding a given threshold. We develop algorithms that process a datacube query by applying the selection condition internally during the computation. This lets us safely prune parts of the computation and obtain the answer more efficiently. Our first technique, called 'specialization', uses the fact that a tuple in the datacube does not meet the given threshold to infer that no finer-level aggregate can meet the threshold. Our second technique, called 'generalization', applies when the actual value of the aggregate is not needed in the output but is used only for comparison with the threshold. We demonstrate the efficiency of these techniques by implementing them within the sparse datacube algorithm of Ross and Srivastava. We present a performance study using synthetic and real-world data sets. Our results indicate substantial performance improvements for queries with selective conditions.
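The pruning idea behind specialization can be illustrated with a small sketch. It is not the authors' implementation and does not use the sparse datacube algorithm of Ross and Srivastava; it is a simple recursive partitioning scheme, and it assumes the aggregate is COUNT with a minimum-count condition, for which a finer cell can never hold more rows than the coarser cell it refines. The function name `iceberg_count_cube` and its parameters are hypothetical.

```python
from collections import defaultdict


def iceberg_count_cube(rows, num_dims, min_count):
    """Return {cell: count} for every cube cell with count >= min_count.

    A cell is a tuple of length num_dims; None in a position means that
    dimension is aggregated away ("ALL"). Assumes COUNT as the aggregate.
    """
    result = {}

    def expand(partition, cell, start_dim):
        # Specialization-style pruning (assumption: COUNT aggregate).
        # If this cell misses the threshold, every finer cell built from
        # a subset of its rows misses it too, so we stop here.
        if len(partition) < min_count:
            return
        result[cell] = len(partition)
        # Refine the cell by fixing one more dimension at a time. Only
        # dimensions >= start_dim are used, so each cell is visited once.
        for d in range(start_dim, num_dims):
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                finer_cell = cell[:d] + (value,) + cell[d + 1:]
                expand(group, finer_cell, d + 1)

    expand(list(rows), (None,) * num_dims, 0)
    return result
```

For example, on the four rows ("a", "x"), ("a", "x"), ("a", "y"), ("b", "x") with `min_count=2`, the cell ("b", ALL) has a single row, so the sketch never partitions it further and the finer cell ("b", "x") is never computed.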
Cite
Ross, K. A., & Zaman, K. A. (2000). Optimizing selections over datacubes. In Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM (pp. 139–152). IEEE. https://doi.org/10.1109/ssdm.2000.869784