The goal of query optimization in query federation over linked data is to minimize both the response time and the completion time. Communication time has the greatest impact on both. Static query optimization can produce inefficient execution plans because data arrival rates are unpredictable and statistics are missing. This study extends the adaptive join operator, which always begins with symmetric hash join to minimize the response time and can switch the join method to bind join to minimize the completion time. The authors extend the adaptive join operator with bind-bloom join to further reduce the communication time and, consequently, the completion time. They compare the new operator with symmetric hash join, bind join, bind-bloom join, and the adaptive join operator with respect to response time and completion time. The performance evaluation shows that the extended operator provides optimal response time and further reduces the completion time. Moreover, it can adapt to different data arrival rates.
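As a rough illustration of why symmetric hash join favors response time, the sketch below emits a result as soon as both matching tuples have arrived, regardless of which source delivers first. It is a minimal, generic sketch and not the authors' operator: the function name `symmetric_hash_join`, the `(side, row)` arrival format, and the key-extractor parameters are assumptions made for this example, and the switch to bind join or bind-bloom join is not modeled.

```python
from collections import defaultdict


def symmetric_hash_join(arrivals, key_left, key_right):
    """Join two inputs whose tuples arrive interleaved.

    arrivals: iterable of (side, row) pairs in arrival order, side is "L" or "R"
              (hypothetical format chosen for this sketch).
    key_left / key_right: functions extracting the join key from a row.
    Yields (left_row, right_row) pairs as soon as a match is possible.
    """
    # One hash table per input; each arriving row is inserted into its own
    # table and immediately probed against the other side's table.
    tables = {"L": defaultdict(list), "R": defaultdict(list)}
    key_of = {"L": key_left, "R": key_right}

    for side, row in arrivals:
        other = "R" if side == "L" else "L"
        key = key_of[side](row)
        tables[side][key].append(row)
        # .get avoids creating empty buckets in the other table while probing.
        for match in tables[other].get(key, []):
            yield (row, match) if side == "L" else (match, row)


# Usage: rows are (join_key, payload) tuples arriving from two sources.
arrivals = [
    ("L", (1, "a")),
    ("R", (1, "x")),
    ("R", (2, "y")),
    ("L", (2, "b")),
    ("L", (1, "c")),
]
results = list(symmetric_hash_join(arrivals, lambda r: r[0], lambda r: r[0]))
```

The first result is produced right after the second arrival, without waiting for either input to finish. The cost is that both hash tables are kept in memory and every tuple from both sources is transferred, which is the communication overhead that bind join and bind-bloom join aim to reduce.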
A multidatabase system (MDBS) integrates information from autonomous local databases managed by heterogeneous database management systems (DBMS) in a distributed environment. For a query involving more than one database, global query optimization should be performed to achieve good overall system performance. The significant differences between an MDBS and a traditional distributed database system (DDBS) make query optimization in the former more challenging than in the latter. This paper discusses the challenges for query optimization in an MDBS and proposes a two-phase optimization approach for processing a query in an MDBS. Several global query optimization techniques suitable for an MDBS are suggested, such as semantic query optimization, query optimization via probing queries, parametric query optimization, and adaptive query optimization. The architecture of a global query optimizer incorporating these techniques is designed.
ISBN: (print) 9781467320887; 9781467320870
For two different types of data stream, a relational stream and an XML stream, this paper proposes an adaptive query optimization strategy called selection-early. The evaluation order of selection constructs can significantly influence the overall performance of multiple query evaluation. Consequently, based on the filtering capability of the current evaluation sequence of selection constructs, captured dynamically at run time, the selection-early strategy adaptively establishes an efficient evaluation sequence. For this purpose, the overall filtering capability of the current evaluation sequence is monitored periodically, and the sequence is rearranged when its filtering capability varies by more than a specific threshold. Accordingly, this strategy keeps the current evaluation sequence as efficient as possible by coping with dynamic variation in its filtering capability. The experimental studies show that the proposed strategy is more scalable and stable in practice than other approaches.
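The monitor-and-rearrange loop described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the paper's algorithm: the class name `AdaptiveSelectionOrder`, the use of "average predicate evaluations per item" as the measure of filtering capability, the fixed-size monitoring window, and the rule of sorting predicates by observed pass rate are all choices made for this example.

```python
class AdaptiveSelectionOrder:
    """Evaluate selection predicates in sequence and reorder them when the
    measured cost of the current sequence drifts beyond a threshold."""

    def __init__(self, predicates, period=100, threshold=0.25):
        self.order = list(predicates)  # list of (name, predicate function)
        self.period = period           # items per monitoring window (assumed)
        self.threshold = threshold     # allowed drift in cost (assumed)
        self.baseline = None           # cost recorded for the current order
        self._reset_window()

    def _reset_window(self):
        self.items = 0
        self.evals = 0
        self.seen = {name: 0 for name, _ in self.order}
        self.passed = {name: 0 for name, _ in self.order}

    def evaluate(self, item):
        """Return True if the item passes every predicate."""
        accepted = True
        for name, pred in self.order:
            self.evals += 1
            self.seen[name] += 1
            if pred(item):
                self.passed[name] += 1
            else:
                accepted = False
                break  # short-circuit: later predicates are skipped
        self.items += 1
        if self.items == self.period:
            self._end_window()
        return accepted

    def _end_window(self):
        # Proxy for filtering capability: predicate evaluations per item.
        cost = self.evals / self.items
        if self.baseline is None:
            self.baseline = cost  # first window after a (re)ordering
        elif abs(cost - self.baseline) > self.threshold:
            def pass_rate(entry):
                name = entry[0]
                # Predicates never reached in this window have no data;
                # treat them as non-selective.
                return self.passed[name] / self.seen[name] if self.seen[name] else 1.0
            self.order.sort(key=pass_rate)  # most selective predicate first
            self.baseline = None            # re-measure under the new order
        self._reset_window()


# Usage: the stream shifts from large values to small ones, so "small"
# stops filtering and "even" becomes the more selective predicate.
op = AdaptiveSelectionOrder(
    [("small", lambda x: x < 10), ("even", lambda x: x % 2 == 0)],
    period=4,
    threshold=0.25,
)
first = [op.evaluate(x) for x in [100, 101, 102, 103]]
order_after_first = [name for name, _ in op.order]
second = [op.evaluate(x) for x in [0, 1, 2, 3]]
order_after_second = [name for name, _ in op.order]
```

Because evaluation short-circuits, the pass rates of later predicates are measured only on items that survived earlier ones, so they are conditional estimates. A fuller implementation would need to account for that bias.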
ISBN: (print) 9781450360111
Many Data-Intensive Scalable Computing (DISC) systems do not support sophisticated cost-based query optimizers because they lack the necessary data statistics. Consequently, many crucial optimizations, such as join order and plan selection, are not well supported in DISC systems. RIOS is a Runtime Integrated Optimizer for Spark that lazily binds to execution plans at runtime, after collecting the statistics needed to make better decisions. We evaluate the efficacy of our approach and show that better plans can be derived at runtime, achieving more than an order-of-magnitude performance improvement compared to plans generated at compile time by the Apache Spark rule-based optimizer.