Copula-based module for selectivity estimation of multidimensional range queries

Dariusz Rafal Augustyn

Conference Proceedings


Advances in Intelligent Systems and Computing (2018) 659 569-580

DOI: 10.1007/978-3-319-67792-7_55


Abstract

Selectivity is a parameter that a query optimizer uses to estimate the size of the data satisfying a query condition. Calculating selectivity requires some representation of the distribution of attribute values. DBMSs commonly use one-dimensional histograms, each describing the distribution of a single attribute. A multidimensional (m-d) representation is required for complex queries with a range selection condition based on many attributes. Storing an m-d representation directly (e.g. an m-d histogram) is very space-consuming in high dimensions; hence a copula-based approach is proposed, in which only a few parameters need to be stored. Using very few copula parameters, the method achieves more accurate selectivity estimates than the method based on the assumption of attribute value independence, which is commonly used by database management systems. The paper presents a software module that provides the copula-based method of selectivity estimation for an m-d range query. The presented solution is based on R Serve and is integrated with the Oracle DBMS. Additional advantages of the module, resulting from caching selectivity values for similar conditions, are also shown.

Citation (APA)

Augustyn, D. R. (2018). Copula-based module for selectivity estimation of multidimensional range queries. In Advances in Intelligent Systems and Computing (Vol. 659, pp. 569–580). Springer Verlag. https://doi.org/10.1007/978-3-319-67792-7_55
