Classical queryoptimization compares query plans according to one cost metric and associates each plan with a constant cost value. In this paper, we introduce the multi-objective parametric query optimization (MPQO) ...
详细信息
Classical queryoptimization compares query plans according to one cost metric and associates each plan with a constant cost value. In this paper, we introduce the multi-objective parametric query optimization (MPQO) problem where query plans are compared according to multiple cost metrics and the cost of a given plan according to a given metric is modeled as a function that depends on multiple parameters. The cost metrics may, for instance, include execution time or monetary fees;a parameter may represent the selectivity of a query predicate that is unspecified at optimization time. MPQO generalizes parametric query optimization (which allows multiple parameters but only one cost metric) and multi-objective queryoptimization (which allows multiple cost metrics but no parameters). We formally analyze the novel MPQO problem and show why existing algorithms are inapplicable. We present a generic algorithm for MPQO and a specialized version for MPQO with piecewise-linear plan cost functions. We prove that both algorithms find all relevant query plans and experimentally evaluate the performance of our second algorithm in multiple scenarios.
Commercial applications usually rely on precompiled parameterized procedures to interact with a database. Unfortunately, executing a procedure with a set of parameters different from those used at compilation time may...
详细信息
Commercial applications usually rely on precompiled parameterized procedures to interact with a database. Unfortunately, executing a procedure with a set of parameters different from those used at compilation time may be arbitrarily suboptimal. parametric query optimization (PQO) attempts to solve this problem by exhaustively determining the optimal plans at each point of the parameter space at compile time. However, PQO is likely not cost-effective if the query is executed infrequently or if it is executed with values only within a subset of the parameter space. In this paper, we propose instead to progressively explore the parameter space and build a parametric plan during several executions of the same query. We introduce algorithms that, as parametric plans are populated, are able to frequently bypass the optimizer but still execute optimal or near-optimal plans.
A problem of effective and efficient approximate query evaluation is addressed as a special case of multi-objective optimization with 2 criteria: the computational resources and the quality of result. The proposed opt...
详细信息
ISBN:
(纸本)9783319232010;9783319232003
A problem of effective and efficient approximate query evaluation is addressed as a special case of multi-objective optimization with 2 criteria: the computational resources and the quality of result. The proposed optimization and execution model provides for interactive trade of quality for speed.
Most relational query optimizers make use of information about the costs of accessing tuples and data structures on various storage devices. This information can at times be off by several orders of magnitude due to h...
详细信息
ISBN:
(纸本)9781581136340
Most relational query optimizers make use of information about the costs of accessing tuples and data structures on various storage devices. This information can at times be off by several orders of magnitude due to human error in configuration setup, sudden changes in load, or hardware failure. In this paper, we attempt to answer the following questions: Are inaccurate access cost estimates likely to cause a typical query optimizer to choose a suboptimal query plan? If an optimizer chooses a suboptimal plan as a result of inaccurate access cost estimates, how far from optimal is this plan likely to be? To address these issues, we provide a theoretical, vector-based frame-work for analyzing the costs of query plans under various storage parameter costs. We then use this geometric framework to characterize experimentally a commercial query optimizer. We develop algorithms for extracting detailed information about query plans through narrow optimizer interfaces, and we perform the characterization using database statistics from a published run of the TPC-H benchmark and a wide range of storage parameters. We show that, when data structures such as tables, indexes, and sorted runs reside on different storage devices, the optimizer can derive significant benefits from having accurate and timely information regarding the cost of accessing storage devices.
A multidatabase system (MDBS) integrates information from autonomous local databases managed by heterogeneous database management systems (DBMS) in a distributed environment. For a query involving more than one databa...
详细信息
A multidatabase system (MDBS) integrates information from autonomous local databases managed by heterogeneous database management systems (DBMS) in a distributed environment. For a query involving more than one database, global queryoptimization should be performed to achieve good overall system performance. The significant differences between an MDBS and a traditional distributed database system (DDBS) make queryoptimization in the former more challenging than in the latter. Challenges for queryoptimization in an MDBS are discussed in this paper. A two-phase optimization approach for processing a query in an MDBS is proposed. Several global queryoptimization techniques suitable for an MDBS, such as semantic queryoptimization, queryoptimization via probing queries, parametric query optimization and adaptive queryoptimization, are suggested. The architecture of a global query optimizer incorporating these techniques is designed.
Learning-based parametric query optimization (PQO) methods excel in static workloads with precise cached plan selection but struggle with dynamic workloads. When facing a parametricquery outside query parameter distr...
详细信息
ISBN:
(纸本)9789819755516;9789819755523
Learning-based parametric query optimization (PQO) methods excel in static workloads with precise cached plan selection but struggle with dynamic workloads. When facing a parametricquery outside query parameter distribution in its training workload, suboptimal plan could be selected and this plan would be unsafely reused. The root cause of this limitation is the plan selection model cannot adapt to distribution drifts in query parameters. In order to extend learning-based parametric query optimization for safely reusing cached plans in dynamic workloads, we introduce a novel approach to predict and avoid reusing a suboptimal plan, referred to as SPQO. As each cached plan has specific reuse decision boundary, each cached plan is assigned to an independent binary classifier. In the offline phase, we employ an undersampling algorithm integrated with Tomek Links technique to effectively train these classifiers under class imbalance setting. During the online phase, we implement hybrid adjustment strategies based on incremental learning, continuously training these classifiers with each prediction and query feedback. Our experiments show SPQO can reduce the 95th percentile relative query latency by 10x in static and 10(3)x in dynamic workloads, and achieve better cache hit rates on various workloads.
暂无评论