With the addition of lambda expressions and the Stream API in Java 8, Java has gained a powerful and expressive query language that operates over in-memory collections of Java objects, making the transformation and analysis of data more convenient, scalable, and efficient. In this paper, we build on the Java 8 Stream API and add a DistributableStream abstraction that supports federated query execution over an extensible set of distributed compute engines. Each query eventually produces a materialized result that is returned either as a local object or as an engine-defined distributed Java Collection, which can be saved, used as a source for future queries, or both. Distinctively, DistributableStream supports changing compute engines both between queries and within a single query, so different parts of a computation can be executed on different platforms. At execution time, the query is organized as a sequence of pipelined stages, each potentially running on a different engine. Each node that takes part in a stage executes its portion of the computation on data that is available locally or was produced by the previous stage. This approach allows computations to be assigned to engines based on pricing, data locality, and resource availability. Coupled with the inherent laziness of stream operations, it brings great flexibility to query planning and separates the semantics of a query from the details of the engine used to execute it. We currently support three engines (Local, Apache Hadoop MapReduce, and Oracle Coherence), and we illustrate how new engines and data sources can be added.
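The laziness the abstract relies on can be illustrated with the standard `java.util.stream` API alone. The following is a minimal sketch, not the paper's DistributableStream API (which the abstract does not spell out); the class name `LazyStreamSketch`, the method `wordLengthHistogram`, and the sample words are hypothetical. It shows that intermediate operations only describe a pipeline and that work happens when a terminal operation materializes the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration using only the standard Java 8 Stream API.
public class LazyStreamSketch {
    // Records which elements the pipeline actually processed.
    static final List<String> visited = new ArrayList<>();

    static Map<Integer, Long> wordLengthHistogram(List<String> words) {
        visited.clear();

        // Intermediate operations are lazy: this only describes the query.
        Stream<String> pipeline = words.stream()
                .filter(w -> !w.isEmpty())
                .peek(visited::add);

        // Nothing has been processed yet, so the trace must still be empty.
        if (!visited.isEmpty()) {
            throw new IllegalStateException("pipeline ran before the terminal operation");
        }

        // The terminal operation triggers execution and materializes a local result.
        return pipeline.collect(
                Collectors.groupingBy(String::length, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("stream", "java", "", "lazy", "engine");
        System.out.println(wordLengthHistogram(words)); // prints {4=2, 6=2}
    }
}
```

Because the pipeline is only a description until `collect` runs, a planner has room to decide where each part executes before any data is touched, which is the property the abstract builds on.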
This thesis deals with performance testing of Java collections on multi-core systems. The aim of the work was to study the collections of the Java Collections Framework as well as some additional collections from the *** package and the Javolution project. The task was to design suitable load tests for these collections, on the basis of which the performance of the individual collections can be compared. The basic solution involves implementing the designed tests in Java and evaluating their results.
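A load test of the kind described might look like the following minimal sketch. It is not the thesis's test suite: the class name `CollectionLoadSketch`, the method `timeConcurrentAdds`, and the thread and element counts are hypothetical, and a single wall-clock measurement like this ignores JIT warm-up and garbage collection, so it is only a starting point for a real comparison.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical harness: several threads add to one shared collection.
public class CollectionLoadSketch {

    // Returns the elapsed nanoseconds for `threads` workers each adding `addsPerThread` items.
    static long timeConcurrentAdds(Collection<Integer> target, int threads, int addsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);      // releases all workers at once
        CountDownLatch done = new CountDownLatch(threads); // signals that every worker finished

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                try {
                    start.await();
                    for (int i = 0; i < addsPerThread; i++) {
                        target.add(i);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }

        long begin = System.nanoTime();
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdown();
        }
        return System.nanoTime() - begin;
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        int adds = 100_000;

        Collection<Integer> syncList = Collections.synchronizedList(new ArrayList<>());
        Collection<Integer> queue = new ConcurrentLinkedQueue<>();

        System.out.println("synchronizedList ns: " + timeConcurrentAdds(syncList, threads, adds));
        System.out.println("ConcurrentLinkedQueue ns: " + timeConcurrentAdds(queue, threads, adds));
    }
}
```

Only thread-safe collections are used here; passing a plain `ArrayList` would make the test itself incorrect rather than merely slow.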