ISBN: 9780889868786 (print)
The size of a workflow representation, as used in programming languages and runtime systems, depends on the number of included tasks and their connections. The execution of large-scale workflows is therefore limited by the memory size of the master node and by task scheduling and transfer costs. We propose a scheme that greatly reduces the size of the workflow representation using array contraction. Focusing on arrays in the workflow representation, our scheme can contract such arrays dynamically, without static analysis of user code. Hierarchically parallel structures, often used in large-scale workflows, can also be contracted. In an evaluation on our object-oriented workflow language MegaScript, the number of API objects in fully contracted random workflow representations was approximately 300 on average, independent of the number of tasks. The required memory size was also reduced to approximately 100 KB on average.
ISBN: 9780889868786 (print)
This paper introduces MaPI, a framework that implements the MapReduce abstraction. Using this framework, the developer can implement parallel optimization algorithms without worrying about message transmission between processes or about how the system performs the parallelization. Furthermore, the algorithm can be developed without parallel code, since the parallelization is hidden by the framework. To demonstrate the efficiency of the framework, a heuristic optimization algorithm is applied to the classical Traveling Salesman Problem. The results show the usability of the framework for developing parallel optimization algorithms.
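The MapReduce idea in the abstract above can be illustrated with a minimal sketch. The abstract does not show MaPI's interface, so `map_reduce`, `evaluate`, `keep_shorter`, and the toy four-city instance below are all hypothetical names chosen for illustration, and a thread pool stands in for whatever message passing the framework hides. The user writes only a sequential map function (score one candidate tour) and a reduce function (keep the shorter of two tours).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from itertools import permutations
import math


def map_reduce(map_fn, reduce_fn, inputs, workers=4):
    """Hypothetical helper (not MaPI's API): map in parallel, then fold the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_fn, inputs))
    return reduce(reduce_fn, mapped)


# Toy TSP instance (assumption): four cities on the corners of a unit square.
CITIES = [(0, 0), (0, 1), (1, 1), (1, 0)]


def tour_length(tour):
    """Length of the closed tour that visits CITIES in the given order."""
    legs = zip(tour, tour[1:] + tour[:1])
    return sum(math.dist(CITIES[a], CITIES[b]) for a, b in legs)


def evaluate(tour):
    """Map step: sequential user code that scores a single candidate tour."""
    return (tour_length(tour), tour)


def keep_shorter(a, b):
    """Reduce step: keep the shorter of two scored tours (ties keep the earlier one)."""
    return a if a[0] <= b[0] else b


# Enumerate every tour that starts at city 0; only feasible for tiny instances.
candidates = [(0,) + p for p in permutations(range(1, 4))]
best_length, best_tour = map_reduce(evaluate, keep_shorter, candidates)
print(best_length, best_tour)
```

Neither `evaluate` nor `keep_shorter` contains any parallel code, which is the property the abstract emphasizes; a real heuristic would generate candidate tours rather than enumerate all of them.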
ISBN: 9780889866386 (print)
The co-operation of parallel simulated annealing processes to solve the vehicle routing problem with time windows (VRPTW) is considered. The objective is to investigate how the number of parallel processes and the frequency of their co-operation influence the accuracy of solutions to the VRPTW. The accuracy of a solution is measured by its proximity to the best known solution.
ISBN: 9780889866386 (print)
In this paper, we present a new method of static load balancing for parallel mining of all frequent itemsets on a distributed-memory (DM) parallel machine. The method partitions the space of all frequent itemsets into subspaces of approximately the same size. Hence, it allows the computational load to be balanced for an arbitrary frequent itemset mining algorithm.
ISBN: 9780889869073 (print)
Utilising modern computer hardware such as multi-core CPUs or Graphics Processing Units (GPUs) provides programmers with great computational power to speed up their code. However, the effort required to parallelise existing software is not always proportionate to the theoretically achievable speedup. This paper introduces a novel method for predicting the speedup that can be achieved by parallelising existing sequential source code, to guide the programmer in deciding whether a parallelisation is worthwhile. We consider multi-core CPUs as well as many-core co-processors. Our evaluation results show that the computed speedup is similar to the real speedup, although our approach relies only on static code analysis.
ISBN: 9780889869073 (print)
Pattern matching is often implemented on the CPU today using deterministic finite automata (DFAs). We present methods to efficiently parallelize the DFA membership test on general-purpose graphics processing units (GPGPUs). Our partitioning scheme builds on the work of Holub and Stekr [1]. Our implementations use the OpenCL programming model, for which we propose a series of algorithms and related memory size constraints. Experimental results are presented on the effectiveness of these algorithms, yielding GPU speedups between 19x and 39x over the Grep utility in matching PROSITE motifs [2].
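One common way to parallelize a DFA membership test is to split the input into chunks, let each worker compute where every possible entry state ends up after its chunk, and then chain those per-chunk state mappings from the start state. The sketch below shows that idea on the CPU with a thread pool; it is an assumption about the general approach, not the paper's OpenCL algorithms, and the three-state DFA (accepting strings over {a, b} that contain "ab") and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical DFA over {a, b} that accepts strings containing "ab".
# State 0: nothing matched, 1: just saw 'a', 2: "ab" seen (accepting sink).
TRANSITIONS = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 2, "b": 2},
}
START, ACCEPTING = 0, {2}


def run_sequential(text):
    """Reference membership test: one pass over the whole input."""
    state = START
    for ch in text:
        state = TRANSITIONS[state][ch]
    return state in ACCEPTING


def chunk_mapping(chunk):
    """For every possible entry state, compute the state reached after the chunk."""
    mapping = {}
    for entry in TRANSITIONS:
        state = entry
        for ch in chunk:
            state = TRANSITIONS[state][ch]
        mapping[entry] = state
    return mapping


def run_chunked(text, n_chunks=4):
    """Process chunks independently, then chain their mappings from START."""
    size = max(1, -(-len(text) // n_chunks))  # ceiling division, at least 1
    chunks = [text[i:i + size] for i in range(0, len(text), size)]
    with ThreadPoolExecutor() as pool:
        mappings = list(pool.map(chunk_mapping, chunks))
    state = START
    for mapping in mappings:
        state = mapping[state]
    return state in ACCEPTING


print(run_chunked("bbbbab"), run_chunked("bbbaaa"))
```

Each chunk costs work proportional to the number of DFA states times the chunk length, so the extra work and memory grow with the size of the automaton; this is one reason memory size constraints matter on a GPU.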
ISBN: 9780889869073 (print)
To exploit modern commodity multicore processors fully, parallel programming needs to be embraced as the norm for constructing programs. Unfortunately, this paradigm requires more expertise than the traditional sequential programming model, and the wide variety of available tools and parallel programming models further complicates the issue. In this research we describe a selection of parallel programming tools and techniques to aid novice parallel programmers in developing efficient parallel C/C++ programs for execution on shared-memory multicore CPUs. We evaluate the performance of a couple of parallelized programs, as well as the effort needed to achieve that performance. The results show that the choice of programming model depends on the type of problem being solved. Furthermore, parallel programming models with higher levels of abstraction require less programming effort while providing performance similar to that of explicit threading models.
ISBN: 9780889866386 (print)
To keep up with the fast pace of Internet development, a cluster architecture has been proposed for next-generation core routers. In a cluster router, parallel computation is expected. Computing the shortest path tree (SPT) is a fundamental problem in implementing OSPF, one of the most popular routing protocols. This paper presents a parallel algorithm, BPA (Branching Parallel Algorithm), for computing the SPT, analyzes the performance of BPA, and validates that performance experimentally.
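For orientation, the shortest path tree that a link-state protocol such as OSPF needs can be computed sequentially with Dijkstra's algorithm. The sketch below is that sequential baseline only; it is not BPA, whose details the abstract does not give. The four-router topology and its link costs are made up for illustration.

```python
import heapq


def shortest_path_tree(graph, root):
    """Sequential Dijkstra: return (distance, parent) maps describing the SPT from root."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path to u was already settled
        for v, cost in graph[u].items():
            nd = d + cost
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


# Hypothetical topology: routers A-D with symmetric link costs.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, parent = shortest_path_tree(graph, "A")
print(dist, parent)
```

The `parent` map encodes the tree itself: following parents from any router leads back to the root along a lowest-cost path.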
Considering a distributed system composed of a set of servers, clients, and resources, as found in environments such as Grids or Clouds, we propose a distributed algorithm for resource allocation. It uses fuzzy logic whenever a server that cannot locally satisfy a client's resource allocation request needs to decide to which remote server the request should be forwarded. Furthermore, by using the concept of logical clocks, our algorithm globally orders pending requests, thus ensuring both fairness in request satisfaction and freedom from starvation. Performance evaluation results obtained with the SimGrid simulator confirm the effectiveness of our proposal.
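The global ordering of pending requests can be pictured with Lamport-style logical clocks: each request carries a (timestamp, server id) pair, and sorting by that pair gives every server the same total order. The sketch below assumes this classic scheme; the abstract does not say which clock rules the algorithm uses, and the `Server` class, its methods, and the client names are hypothetical.

```python
import heapq


class Server:
    """Hypothetical server that keeps a Lamport logical clock."""

    def __init__(self, server_id):
        self.server_id = server_id
        self.clock = 0

    def stamp_request(self, client):
        """Local event: advance the clock and timestamp a new client request."""
        self.clock += 1
        return (self.clock, self.server_id, client)

    def receive(self, request):
        """On receiving a forwarded request, move the clock past its timestamp."""
        self.clock = max(self.clock, request[0]) + 1


s1, s2 = Server(1), Server(2)
r_a = s1.stamp_request("alice")  # stamped at s1 with clock 1
r_b = s2.stamp_request("bob")    # stamped at s2 with clock 1, tie broken by server id
s2.receive(r_a)                  # s2 learns of alice's request and advances its clock
r_c = s2.stamp_request("carol")  # necessarily ordered after alice's request

# Any server that holds the same pending set pops requests in the same order.
pending = [r_c, r_b, r_a]
heapq.heapify(pending)
order = [heapq.heappop(pending)[2] for _ in range(3)]
print(order)
```

Because timestamps only grow and ties are broken deterministically by server id, a pending request eventually becomes the smallest in the queue, which is the intuition behind the absence of starvation.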
ISBN: 9780889866386 (print)
In this paper, we discuss the parallelization, on a cluster of workstations, of a high-level computer vision application in medical imaging: multi-scale active shape description of MR (magnetic resonance) brain images of epileptic patients using active contour models. The paper gives a comparative study and analysis of three different approaches to parallel implementation, using the corresponding parallel computing patterns: Temporal Multiplexing, Pipeline, and Composite Pipeline. The cluster-based parallel implementations have shown encouraging results.