In the context of incremental view maintenance (IVM), delta query derivation is an essential technique for speeding up the processing of large, dynamic datasets. The goal is to generate delta queries that, given a sma...
详细信息
ISBN:
(纸本)9781450341912
In the context of incremental view maintenance (IVM), delta query derivation is an essential technique for speeding up the processing of large, dynamic datasets. The goal is to generate delta queries that, given a small change in the input, can update the materialized view more efficiently than via recomputation. In this work we propose the first solution for the efficient incrementalization of positive nested relational calculus (NRC+) on bags (with integer multiplicities). More precisely, we model the cost of NRC+ operators and classify queries as efficiently incrementalizable if their delta has a strictly lower cost than full re-evaluation. Then, we identify IncNRC(+), a large fragment of NRC+ that is efficiently incrementalizable and we provide a semantics-preserving translation that takes any NRC+ query to a collection of IncNRC(+) queries. Furthermore, we prove that incremental maintenance for NRC+ is within the complexity class NC0 and we showcase how recursive IVM, a technique that has provided significant speedups over traditional IVM in the case of flat queries [25], can also be applied to IncNRC(+).
One of the performance bottlenecks of nested recursive computations is the intermediate collections created at different levels of recursion. The existing techniques such as vertical and horizontal loop fusion do not ...
详细信息
ISBN:
(纸本)9781450399203
One of the performance bottlenecks of nested recursive computations is the intermediate collections created at different levels of recursion. The existing techniques such as vertical and horizontal loop fusion do not remove such intermediate allocations. This paper proposes deep fusion, a technique for the efficient compilation of nested recursive computation over collections. The input to our compilation framework is a high-level functional program that can represent computations on flat and nested collections such as lists, sets, bags, and maps. The intermediate collections are removed in three levels. First, the immutable collections are translated into mutable ones by leveraging in-place updates using the destination-passing style technique. Second, deep fusion enables the inner level of recursion to reuse the destinations of the outer levels for in-place updates. Third, deep fusion removes the need to allocate tiny intermediate collections at different depths of recursion. Our experiments show that deep fusion can improve the performance of nested recursion over nested lists and maps.
Many map-reduce frameworks as well as NoSQL systems rely on collection programming as their interface of choice due to its rich semantics along with an easily parallelizable set of primitives. Unfortunately, the poten...
详细信息
Many map-reduce frameworks as well as NoSQL systems rely on collection programming as their interface of choice due to its rich semantics along with an easily parallelizable set of primitives. Unfortunately, the potential of collection programming is not entirely fulfilled by current systems as they lack efficient incremental view maintenance (IVM) techniques for queries producing large nested results. This comes as a consequence of the fact that the nesting of collections does not enjoy the same algebraic properties underscoring the optimization potential of typical collection processing constructs. We propose the first solution for the efficient incrementalization of collection programming in terms of its core constructs as captured by the positive nested relational calculus ( NRC + ) on bags (with integer multiplicities). We take an approach based on delta query derivation, whose goal is to generate delta queries which, given a small change in the input, can update the materialized view more efficiently than via recomputation. More precisely, we model the cost of NRC + operators and classify queries as efficiently incrementalizable if their delta has a strictly lower cost than full re-evaluation. Then, we identify IncNRC +, a large fragment of NRC + that is efficiently incrementalizable and we provide a semantics-preserving translation that takes any NRC + query to a collection of IncNRC + queries. Furthermore, we prove that incremental maintenance for NRC + is within the complexity class NC 0 and we showcase how Recursive IVM, a technique that has provided significant speedups over traditional IVM in the case of flat queries, can also be applied to IncNRC +. Existing systems are also limited wrt. the size of inner collections that they can effectively handle before running into severe performance bottlenecks. In particular, in the face of nested collections with skewed cardinalities developers typically have to undergo a painful process of manual query re-writes
暂无评论