ISBN (print): 0769516599
This paper describes the design philosophy for the Grid system being developed by the Japan Committee on High-Performance Computing for Bioinformatics and the Initiative for Parallel Bioinformatics (IPAB). Grid is an attractive solution for building a distributed bioinformatics environment that combines high-performance parallel computers, large genomic databases, and computation-intensive applications such as homology search and molecular simulation. However, much remains to be done in Grid system design, especially in wide area network environments. OBIGrid emphasizes the virtual organization aspect of the Grid system and gives priority to security and scalability over performance.
ISBN (print): 9781479959198
The classical redistribution problem aims at optimally scheduling communications when moving from an initial data distribution to a target distribution where each processor will host a subset of data items. However, modern computing platforms are equipped with a powerful interconnection switch, and the cost of a given communication is (almost) independent of the location of its sender and receiver. This leads to generalizing the redistribution problem as follows: find the optimal one-to-one mapping of the subsets of data items onto the processors for which the cost of the redistribution is minimal. This paper studies the complexity of this generalized problem. We provide optimal algorithms and evaluate their gain over classical redistribution through simulations. We also show the NP-hardness of the problem of finding the optimal data partition and processor permutation (defined by new subsets) that minimize the cost of redistribution followed by a simple computation kernel.
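The generalized problem stated above can be illustrated with a small brute-force sketch. It is not the paper's algorithm: it assumes, for illustration only, that the cost of a redistribution is the total number of data items that must move, and it enumerates every one-to-one mapping, which is only practical for a handful of processors. The names `redistribution_cost` and `best_mapping` are hypothetical.

```python
from itertools import permutations

def redistribution_cost(initial, target_parts, mapping):
    """Items that must move when target part j is hosted on processor mapping[j].

    Assumed cost model (not from the paper): one unit per item that the
    hosting processor does not already hold.
    """
    return sum(
        len(target_parts[j] - initial[mapping[j]])
        for j in range(len(target_parts))
    )

def best_mapping(initial, target_parts):
    """Exhaustively search all one-to-one mappings of parts onto processors."""
    processors = range(len(initial))
    return min(
        permutations(processors),
        key=lambda m: redistribution_cost(initial, target_parts, m),
    )

# initial[p] is the set of items currently stored on processor p.
initial = [{"a", "b"}, {"c", "d"}, {"e", "f"}]
# target_parts[j] is the j-th subset required by the target distribution.
target_parts = [{"c", "d", "e"}, {"f", "a"}, {"b"}]

identity = (0, 1, 2)  # classical redistribution: part j stays on processor j
best = best_mapping(initial, target_parts)

print(redistribution_cost(initial, target_parts, identity))  # 6
print(best, redistribution_cost(initial, target_parts, best))  # (1, 2, 0) 2
```

In this toy instance, choosing the mapping freely moves 2 items instead of the 6 required when part j must land on processor j. Under this volume-based cost the search is an instance of the assignment problem, so a polynomial-time matching algorithm could replace the enumeration.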
ISBN (print): 9781728112466
Modern data generation is enormous; we now capture events at increasingly fine granularity, and require processing at rates approaching real-time. For graph analytics, this explosion in data volumes and processing demands has not been matched by improved algorithmic or infrastructure techniques. Instead of exploring solutions to keep up with the velocity of the generated data, most of today's systems focus on analyzing individually built historic snapshots. Modern graph analytics pipelines must evolve to become viable at massive scale, and move away from static, post-processing scenarios to support on-line analysis. This paper presents our progress towards a system that analyzes dynamic incremental graphs, responsive at single-change granularity. We present an algorithmic structure using principles of recursive updates and monotonic convergence, and a set of incremental graph algorithms that can be implemented based on this structure. We also present the required middleware to support graph analytics at fine, event-level granularity. We envision that graph topology changes are processed asynchronously, concurrently, and independently (without shared state), converging an algorithm's state (e.g. single-source shortest path distances, connectivity analysis labeling) to its deterministic answer. The expected long-term impact of this work is to enable a transition away from offline graph analytics, allowing knowledge to be extracted from networked systems in real-time.
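The idea of recursive updates with monotonic convergence can be sketched for single-source shortest paths as follows. This is a sequential, single-process illustration and not the system described in the abstract: it assumes edge insertions only and non-negative weights, and the class name `IncrementalSSSP` is hypothetical. Each inserted edge triggers an update that recurses only while it lowers a distance, so every stored distance decreases monotonically toward its final value.

```python
import math
from collections import defaultdict, deque

class IncrementalSSSP:
    """Single-source shortest paths maintained one edge insertion at a time."""

    def __init__(self, source):
        self.adj = defaultdict(list)                   # node -> [(neighbor, weight)]
        self.dist = defaultdict(lambda: math.inf)      # unknown nodes start at infinity
        self.dist[source] = 0.0

    def add_edge(self, u, v, w):
        """Insert edge u -> v with weight w >= 0 and propagate any improvement."""
        self.adj[u].append((v, w))
        queue = deque([(v, self.dist[u] + w)])
        while queue:
            node, candidate = queue.popleft()
            # Monotonic rule: accept a candidate only if it lowers the distance.
            if candidate < self.dist[node]:
                self.dist[node] = candidate
                # Recursive update: neighbors may improve as a consequence.
                for nxt, weight in self.adj[node]:
                    queue.append((nxt, candidate + weight))

sssp = IncrementalSSSP("s")
sssp.add_edge("a", "b", 1)   # "a" is not yet reachable, so nothing changes
sssp.add_edge("s", "a", 4)   # a = 4, then b = 5 by propagation
sssp.add_edge("s", "b", 2)   # b improves to 2
sssp.add_edge("b", "a", 1)   # a improves to 3

print(sssp.dist["a"], sssp.dist["b"])  # 3.0 2.0
```

Because an update is applied only when it lowers a distance, stale candidates are simply discarded, which is the property that makes the final state independent of the order in which improvements are processed. Edge deletions would need a different mechanism and are outside this sketch.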
ISBN (print): 9781665432818
Reasoning about, and debugging, hierarchical control plane systems is hard. Moreover, OpenTracing, the industry-adopted tracing model, has problems with tracing activities in the presence of coalescing effects, which materialize, among others, in cloud platforms and build systems. In our earlier work we proposed a novel approach to distributed systems tracing, based on an extension of OpenTracing. The aim of this contribution is to outline how the proposed approach can be implemented.
ISBN (print): 9781479959198
Building correct distributed systems is challenging, and any attempt to provide a direct, global proof of correctness of a distributed system is bound to fail. An interesting alternative approach consists in starting from a specification or program of the system under construction, verifying all properties of interest on it - which has a much lower complexity than the verification on a distributed implementation - and finally deriving a distributed implementation using some correct-by-construction approach. Note that this topic is related to distributed control, where the objective is to enforce in a distributed manner some global constraint on a plant. Deriving such a distributed controller directly is difficult, and the correctness of the resulting controller is difficult to prove. A more feasible approach in this context is to first construct a global controller, then transform it into a distributed one, again by means of a correct-by-construction approach.
This special issue presents new trends in computer architecture and in parallel and distributed systems. It is based on the best papers of the 24th International Symposium on Computer Architecture and High Performance Computing, which was held at Columbia University in New York, NY, USA, on October 24-26, 2012. The authors were invited to provide extended versions of the papers presented at the conference, taking into account suggestions from the double-blind peer review process and comments gathered during the conference.
ISBN (print): 0769523129
Commercial database systems must typically rely on fast hardware platforms and interconnects to deal efficiently with data in parallel. However, cheap computing power can be applied for flexibility and scalability in managing large data volumes if the right choices are made concerning data placement and processing. Our work concentrates on the use of cheap computing power in possibly slow, non-dedicated local networks to deliver, over demanding query-intensive databases, computing power that would otherwise be unachievable without expensive specialized hardware and massively parallel systems. The Node Partitioned Data Management System (NPDM) works on computing nodes on non-dedicated local networks. In this paper we concentrate on query transformations required for efficient processing over a specialized query-intensive schema. The decision support benchmark TPC-H is used as a case study for the transformations and for experimental analysis.
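As an illustration of why a query may need to be transformed before it runs over node-partitioned data, consider a grouped average. The sketch below is not taken from NPDM; it shows one widely used rewrite, assumed here for illustration, in which each node computes a partial SUM and COUNT per group and a merge step combines them. The function names and sample rows are hypothetical.

```python
def local_partial(rows):
    """Runs on one node: partial (sum, count) per group key."""
    partial = {}
    for key, value in rows:
        running_sum, running_count = partial.get(key, (0.0, 0))
        partial[key] = (running_sum + value, running_count + 1)
    return partial

def merge_avg(partials):
    """Runs on the coordinating node: combine partials, then divide."""
    totals = {}
    for partial in partials:
        for key, (part_sum, part_count) in partial.items():
            total_sum, total_count = totals.get(key, (0.0, 0))
            totals[key] = (total_sum + part_sum, total_count + part_count)
    return {key: s / c for key, (s, c) in totals.items()}

# Rows of (group key, value) as they might be spread over two nodes.
node_a = [("R", 10.0), ("R", 20.0), ("N", 5.0)]
node_b = [("R", 60.0), ("N", 15.0)]

result = merge_avg([local_partial(node_a), local_partial(node_b)])
print(result)  # {'R': 30.0, 'N': 10.0}
```

Averaging the per-node averages directly would give 37.5 for group "R" (the mean of 15 and 60) rather than 30, which is why the aggregate has to be decomposed into mergeable parts before being pushed to the nodes.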
ISBN (print): 0769522106
Here we consider the problem of parallel execution of the Join operation by a J2EE cluster. J2EE clusters are intended for coarse-grain distributed processing of multiple queries/business transactions over the Web. Thus, the possibility of using a J2EE cluster for fine-grain parallel computations (parallel Joins in our case) is intriguing and of practical interest. We have developed a new variant of the SFR algorithm for parallel computation of the Cartesian Product in Join operations and proved its optimality in terms of communication/execution-time tradeoffs via a simple lower bound. Our experimental results show that, although J2EE is considered to be a platform that uses complex interfaces and software entities, such as various types of Java beans, J2EE clusters can be used efficiently to execute Join operations in parallel.
ISBN (print): 9781424437511
Stream processing systems have become important, as applications like media broadcasting, sensor network monitoring and on-line data analysis increasingly rely on real-time stream processing. Such systems are often challenged by the bursty nature of the applications. In this paper, we present BARRE (Burst Accommodation through Rate REconfiguration), a system to address the problem of bursty data streams in distributed stream processing systems. Upon the emergence of a burst, BARRE dynamically reserves resources dispersed across the nodes of a distributed stream processing system, based on the requirements of each application as well as the resources available on the nodes. Our experimental results over our Synergy distributed stream processing system demonstrate the efficiency of our approach.
ISBN (print): 9781509036820
Distributed embedded systems are increasingly prevalent in numerous applications, and with pervasive network access within these systems, security is also a critical design concern. In this paper, we present a modeling and optimization framework for distributed reconfigurable embedded systems, which maps tasks on a distributed embedded system with the goal of optimizing latency, energy, and/or security across all computing and communication levels. The proposed modeling framework for dataflow applications integrates models for computational latency, security levels for inter-task and intra-task communication, communication latency, and power consumption. We evaluate the proposed methodology using a video-based object detection and tracking application.