ISBN: 9781614994589 (print)
Contemporary DBMSs already use data partitioning and data-flow analysis for intra-query parallelism. We study the problem of identifying data-partitioning targets. To rank candidates, we propose a simple cost model that relies on plan structure, operator cost, and selectivity for a given base table. We evaluate this model in various optimization schemes and observe how it affects degrees of parallelism and query execution latencies across all TPC-H queries. Compared with the existing naive model, which partitions the largest physical table in the query, our approach identifies significantly better partitioning targets, resulting in a significantly higher degree of resource utilization and intra-query parallelism for most queries while having little impact on the remaining queries in the TPC-H benchmark.
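The abstract does not spell out the cost model, so the following Python sketch is only an illustration of the general idea, not the authors' formulation. It assumes a hypothetical `PlanNode` tree in which each operator has a per-row cost and a selectivity, and it scores a base table by the work done on the path from its scan to the plan root. The cardinalities and costs are invented toy numbers, not TPC-H measurements.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PlanNode:
    """Hypothetical plan operator; all fields are illustrative assumptions."""
    op: str                          # operator name, e.g. "scan" or "join"
    cost_per_row: float              # assumed cost of processing one input row
    selectivity: float = 1.0         # fraction of input rows the operator emits
    table: Optional[str] = None      # base-table name, set on scans only
    rows: int = 0                    # base-table cardinality, scans only
    children: List["PlanNode"] = field(default_factory=list)


def base_tables(root: PlanNode) -> Dict[str, int]:
    """Collect base-table cardinalities from the scans in the plan."""
    found: Dict[str, int] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.table is not None:
            found[node.table] = node.rows
        stack.extend(node.children)
    return found


def score_tables(root: PlanNode) -> Dict[str, float]:
    """Score each base table by the work along its scan-to-root path."""
    scores: Dict[str, float] = {}

    def walk(node: PlanNode, ancestors: List[PlanNode]) -> None:
        # `ancestors` lists the operators above `node`, nearest parent first.
        if node.table is not None:
            rows = float(node.rows)
            work = node.cost_per_row * rows
            rows *= node.selectivity
            for anc in ancestors:
                work += anc.cost_per_row * rows   # rows flowing into `anc`
                rows *= anc.selectivity           # rows `anc` passes upward
            scores[node.table] = scores.get(node.table, 0.0) + work
        for child in node.children:
            walk(child, [node] + ancestors)

    walk(root, [])
    return scores


# Toy plan: a selective scan of a large table joined with a smaller table.
plan = PlanNode(op="join", cost_per_row=5.0, children=[
    PlanNode(op="scan", cost_per_row=1.0, selectivity=0.01,
             table="lineitem", rows=6000),
    PlanNode(op="scan", cost_per_row=1.0, table="orders", rows=1500),
])

sizes = base_tables(plan)
naive_target = max(sizes, key=sizes.get)      # largest physical table
scores = score_tables(plan)
ranked_target = max(scores, key=scores.get)   # highest path work
```

In this toy plan the naive rule picks `lineitem` because it is the largest table, while the path-work score picks `orders`: the filter on `lineitem` leaves few rows for the expensive join, so partitioning it would parallelize comparatively little work.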