Abstract
Complex data mining queries raise query optimization issues similar to those of traditional database queries. However, few works have applied cost-based query optimization, the key technique for optimizing traditional database queries, to complex mining queries. In this work, we develop a cost-based query optimization framework for an important class of data mining queries, namely frequent pattern mining across multiple databases. Specifically, we make the following contributions: 1) We present a rich class of queries for mining frequent itemsets across multiple datasets, supported by a SQL-based mechanism. 2) We present an approach to enumerate all possible query plans for the mining queries and, based on this enumeration algorithm, develop a dynamic programming approach and a branch-and-bound approach to find optimal query plans with the least mining cost. 3) We introduce models to estimate the cost of individual mining operators. 4) We evaluate our query optimization techniques on both real and synthetic datasets and show significant performance improvements. Copyright 2008 ACM.
Jin, R., Fuhry, D., & Alali, A. (2008). Cost-based query optimization for complex pattern mining on multiple databases. In Advances in Database Technology - EDBT 2008 - 11th International Conference on Extending Database Technology, Proceedings (pp. 380–391). Association for Computing Machinery. https://doi.org/10.1145/1353343.1353391