Selectivity is a parameter that a query optimizer uses to estimate the size of the data satisfying a query condition. Calculating selectivity requires some representation of the distribution of attribute values. DBMSes commonly use one-dimensional histograms, each describing the distribution of a single attribute. Complex queries with a range selection condition over many attributes require a multidimensional (m-d) representation. Storing an m-d representation directly (e.g. as an m-d histogram) is very space-consuming in high dimensions, so a copula-based approach is proposed in which only a few parameters need to be stored. With these few copula parameters, the method estimates selectivity more accurately than the method based on the attribute value independence assumption, which is commonly used by database management systems. The paper presents a software module that provides the copula-based method of selectivity estimation for an m-d range query. The presented solution is based on Rserve and is integrated with Oracle DBMS. Some additional advantages of the module, resulting from caching selectivity values for similar conditions, are also shown.
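To make the contrast between the copula-based and independence-based estimates concrete, here is a minimal two-attribute sketch. It is an illustration under stated assumptions, not the module described in the paper: the abstract does not name a copula family, so the one-parameter Clayton copula is assumed here only because it has a closed form, and the function names (`clayton_cdf`, `range_selectivity`, `independence_selectivity`) are hypothetical. The inputs are marginal CDF values at the range bounds, which in a DBMS could be read from the existing one-dimensional histograms.

```python
from functools import partial


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) for theta > 0 (assumed family, for illustration)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0  # C(0, v) = C(u, 0) = 0 for any copula
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def range_selectivity(copula, lo, hi):
    """Selectivity of a 2-d range condition via inclusion-exclusion on the copula.

    lo = (u1, v1) and hi = (u2, v2) hold the marginal CDF values of the two
    attributes at the lower and upper bounds of the range.
    """
    (u1, v1), (u2, v2) = lo, hi
    return copula(u2, v2) - copula(u1, v2) - copula(u2, v1) + copula(u1, v1)


def independence_selectivity(lo, hi):
    """Baseline estimate: product of the two one-dimensional selectivities."""
    (u1, v1), (u2, v2) = lo, hi
    return (u2 - u1) * (v2 - v1)


if __name__ == "__main__":
    # theta = 2.0 is an arbitrary illustrative dependence parameter.
    copula = partial(clayton_cdf, theta=2.0)
    lo, hi = (0.0, 0.0), (0.5, 0.5)
    print("copula-based:", range_selectivity(copula, lo, hi))
    print("independence:", independence_selectivity(lo, hi))
```

With positively dependent attributes, the copula-based estimate for the lower-left quadrant is larger than the product of the marginal selectivities, which is the kind of gap the independence assumption cannot capture. Only the single parameter `theta` has to be stored in addition to the one-dimensional histograms.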
Augustyn, D. R. (2018). Copula-based module for selectivity estimation of multidimensional range queries. In Advances in Intelligent Systems and Computing (Vol. 659, pp. 569–580). Springer Verlag. https://doi.org/10.1007/978-3-319-67792-7_55