In the group mutual exclusion problem [Y. Joung, Asynchronous group mutual exclusion, Distrib. Comput. 13 (2000) 189], which generalizes mutual exclusion [E. Dijkstra, Solution of a problem in concurrent programming control, Comm. ACM 8 (9) (1965) 569], a process chooses a session when it requests entry into the Critical Section. A group mutual exclusion algorithm must ensure that the mutual exclusion property holds: if two processes are in the Critical Section at the same time, then they request the same session. In addition to mutual exclusion, lockout freedom, bounded exit, and concurrent entering are basic properties that are desirable in any group mutual exclusion algorithm. Hadzilacos [Proc. 20th Annual Symp. on Principles of Distributed Computing, 2001, pp. 100-106] first introduced a fairness condition, called first-come-first-served (FCFS), for group mutual exclusion. The only known FCFS group mutual exclusion algorithm is due to Hadzilacos (same reference) and requires Theta(N^2) bounded shared registers, where N is the number of processes. We present an FCFS group mutual exclusion algorithm that uses only Theta(N) bounded shared registers. (The existence of such an algorithm was posed as an open problem by Hadzilacos.) (c) 2005 Elsevier B.V. All rights reserved.
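The safety property stated in this abstract (processes may share the Critical Section only if they requested the same session) can be illustrated with a minimal Python sketch. This is not the algorithm of the paper: it relies on a condition variable instead of bounded shared read/write registers, and it provides neither FCFS fairness nor lockout freedom. The class name `GroupLock` and its methods are hypothetical names chosen for the illustration.

```python
import threading


class GroupLock:
    """Illustrative group mutual exclusion built on a condition variable.

    Assumption: this is a teaching sketch only. It guarantees that all
    processes inside the critical section requested the same session and
    that same-session requests can enter concurrently, but it is neither
    FCFS nor lockout-free.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._session = None  # session currently occupying the critical section
        self._count = 0       # number of threads currently inside

    def enter(self, session):
        with self._cond:
            # Wait while the critical section is occupied by a different session.
            while self._count > 0 and self._session != session:
                self._cond.wait()
            self._session = session
            self._count += 1

    def leave(self):
        with self._cond:
            self._count -= 1
            if self._count == 0:
                # Last thread out frees the section for any session.
                self._session = None
                self._cond.notify_all()
```

Two threads calling `enter("A")` proceed together, while a thread calling `enter("B")` blocks until both have called `leave()`.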
According to the characteristics of large scale finite element method (FEM) parallel processing on cluster computers, an optimized automatic partitioning approach, modified multilevel recursive spectral bisection (MRSB), is proposed. The approach modifies the coarsening, partitioning and refinement phases of multilevel recursive spectral bisection. A vertex balancing strategy (VBS) and a balancing Kernighan-Lin (BKL) method are proposed to overcome the shortcomings of multilevel recursive spectral bisection. The approach is also applied to practical problems with different geometries. The partition results show that the proposed method is valid and achieves a significant improvement.
ISBN (print): 1595930108
In this paper we present an evaluation of selected parallel strategies for Simulated Annealing and Simulated Evolution, identifying the impact of various issues on the effectiveness of parallelization. Issues under consideration are the characteristics of these algorithms, the problem instance, and the implementation environment. Observations are presented regarding the impact of parallel strategies on runtime and achievable solution quality. Effective parallel algorithm design choices are identified, along with pitfalls to avoid. We further attempt to generalize our assessments to other heuristics.
ISBN (print): 0769522831
This article presents the design of a High Level Parallel Composition or CPAN (according to its Spanish acronym) that implements a parallelization of the algorithmic design technique named Branch & Bound and uses it to solve the Travelling Salesman Problem (TSP), within a methodological infrastructure made up of an environment of Parallel Objects, an approach to Structured Parallel Programming and the Object-Orientation paradigm. A CPAN is defined as the composition of a set of parallel objects of three types: one manager object, the stage objects and the Collector objects. Following this idea, we show the Branch & Bound design technique implemented as an algorithmic parallel pattern of communication among processes, based on the CPAN model. Thus, in this work, the CPAN Branch & Bound is added as a new pattern to the library of classes already proposed in [9], which initially consisted of the CPANs Farm, Pipe and TreeDV. These represent, respectively, the communication patterns Farm, Pipeline and Binary Tree, the last of which implements the design technique known as Divide and Conquer. As the programming environment used to derive the proposed CPANs, we use C++ and the POSIX standard for thread programming.
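For orientation, the Branch & Bound technique applied to the TSP can be sketched sequentially as below. This is not the CPAN design from the article (which is written in C++ with POSIX threads and distributes work among manager, stage and Collector objects); the function name `tsp_branch_and_bound` and the lower bound used here are assumptions made for the illustration. Each subtree explored by `search` is an independent unit of work, which is what a parallel pattern could hand out to workers.

```python
import math


def tsp_branch_and_bound(dist):
    """Return (best_cost, best_tour) for a distance matrix with n >= 2 cities.

    Assumption: a simple depth-first Branch & Bound whose lower bound adds
    the cheapest outgoing edge of every city that still has to be left.
    """
    n = len(dist)
    best_cost = math.inf
    best_tour = None
    # Cheapest edge leaving each city, used for the lower bound.
    min_out = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]

    def search(tour, visited, cost):
        nonlocal best_cost, best_tour
        if len(tour) == n:
            total = cost + dist[tour[-1]][tour[0]]  # close the cycle
            if total < best_cost:
                best_cost, best_tour = total, tour[:]
            return
        # Bound: the current city and every unvisited city must still be left once.
        bound = cost + min_out[tour[-1]] + sum(
            min_out[c] for c in range(n) if c not in visited
        )
        if bound >= best_cost:
            return  # prune: this subtree cannot improve on the incumbent
        for nxt in range(n):
            if nxt not in visited:
                visited.add(nxt)
                tour.append(nxt)
                search(tour, visited, cost + dist[tour[-2]][nxt])
                tour.pop()
                visited.remove(nxt)

    search([0], {0}, 0)
    return best_cost, best_tour
```

Calling `tsp_branch_and_bound(dist)` on a square distance matrix returns the cost of a shortest tour starting and ending at city 0, together with one such tour.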
ISBN (print): 3540260323
The development of efficient parallel algorithms for large scale wildfire simulations is a challenging research problem because the factors that determine wildfire behavior are complex. These factors make static parallel algorithms inefficient, especially when a large number of processors is used, because we cannot accurately predict the propagation of the fire and its computational requirements at runtime. In this paper, we propose an Autonomic Runtime Manager (ARM) to dynamically exploit the physical properties of the fire simulation and use them as the basis of our self-optimization algorithm. At each step of the wildfire simulation, the ARM decomposes the computational domain into several natural regions (e.g., burning, unburned, burned), where each region has the same temporal and spatial characteristics. The number of burning, unburned and burned cells determines the current state of the fire simulation and can then be used to accurately predict the computational power required for each region. By regularly monitoring and analyzing the state of the simulation, and using that to drive the runtime optimization, we can achieve significant performance gains because we can efficiently balance the computational load on each processor. Our experimental results show that the performance of the fire simulation is improved by 45% compared with a static partitioning algorithm.
ISBN (print): 0780393635
This paper presents an investigation into exploiting the population-based nature of Learning Classifier Systems for their use within highly parallel systems. In particular, the use of simple accuracy-based Learning Classifier Systems within the ensemble machine approach is examined. Results indicate that including a rule migration mechanism inspired by parallel Genetic Algorithms is an effective way to improve learning speed.
ISBN (print): 3540297707
The longest common subsequence (LCS) problem is a fundamental problem in sequence alignment. In this paper, we first present a fast parallel algorithm for sequence similarity with LCS. For two sequences of lengths m and n (m <= n), the algorithm uses n processors and takes O(m) computation time. The time-area cost of the algorithm is O(mn), which is optimal. Based on this algorithm, we also give a fast parallel algorithm that computes the length of the LCS in O(log m) time. To the best of our knowledge, this is the fastest among the parallel LCS algorithms on array architectures.
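As a point of reference for the O(mn) cost quoted above, the textbook sequential dynamic program for the LCS length performs O(mn) work in total. The sketch below is that standard recurrence, not the parallel array algorithm of the paper; the function name `lcs_length` is an assumption made for the illustration.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of sequences a and b.

    Standard row-by-row dynamic program: O(m * n) time, O(n) extra space,
    where m <= n are the two sequence lengths.
    """
    if len(a) > len(b):
        a, b = b, a  # keep the shorter sequence as the outer loop (m <= n)
    prev = [0] * (len(b) + 1)  # row for the empty prefix of a
    for ch in a:
        curr = [0]
        for j, other in enumerate(b, start=1):
            if ch == other:
                curr.append(prev[j - 1] + 1)  # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

The outer loop runs m times and the inner loop n times, so a design that assigns one processor to each of the n columns and finishes in O(m) steps has a processor-time product matching this sequential work.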
ISBN (print): 0769524052
Increasing the number of instructions executing in parallel has helped improve processor performance, but the technique is limited. Executing code on parallel threads and processors has fewer limitations, but most computer programs tend to be serial in nature. This paper presents a compiler optimisation that parallelises code at run-time inside a JVM and thereby increases the number of threads. We show SPEC JVM benchmark results for this optimisation. On a current desktop processor, performance is slower than without parallel threads because of thread creation costs, but with these costs removed, performance is better than that of the serial code. We measure the threading costs and discuss how a future computer architecture will make this optimisation feasible for exploiting thread parallelism instead of instruction and/or vector parallelism.
ISBN (print): 0769524869
In this paper, we present a refinement of the BSP (Bulk Synchronous Parallel) cost model that allows a more exact prediction of the communication cost of parallel algorithms. Our approach is based on two points: (I) more detailed benchmarks that take into account all factors influencing the cost of sending a word in a communication, and (II) a more elaborate prediction method that carefully detects the context in which the communications of the algorithm under prediction take place.
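For context, the classic BSP cost model that this abstract sets out to refine charges each superstep w + h*g + l, where w is the maximum local computation, h the maximum number of words sent or received by any processor, g the per-word communication cost and l the barrier synchronization cost. The sketch below implements only this baseline formula; the refinements proposed in the paper are not reproduced, and the function names are assumptions made for the illustration.

```python
def superstep_cost(w, h, g, l):
    """Classic BSP cost of one superstep: computation + communication + barrier."""
    return w + h * g + l


def program_cost(supersteps, g, l):
    """Total cost of a BSP program.

    supersteps: iterable of (w, h) pairs, one per superstep.
    g, l: machine parameters, assumed constant here. This constancy is the
    simplification that a refined model would relax.
    """
    return sum(superstep_cost(w, h, g, l) for w, h in supersteps)


# Two supersteps on a hypothetical machine with g = 4 and l = 30:
# (100 + 10*4 + 30) + (50 + 20*4 + 30) = 330
example_total = program_cost([(100, 10), (50, 20)], g=4, l=30)
```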
ISBN (print): 0780390504
The rapid development of space and computer technologies has made it possible to store large amounts of remotely sensed image data collected from heterogeneous sources. In particular, NASA is continuously gathering imagery data with hyperspectral sensors such as the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) or the Hyperion imager aboard the Earth Observing-1 (EO-1) spacecraft. The development of efficient techniques for transforming the massive amount of collected data into scientific understanding is critical for space-based Earth science and planetary exploration. Heterogeneous networks of workstations are a very promising, cost-effective parallel computing architecture. Unlike traditional homogeneous parallel platforms, heterogeneous architectures are composed of processors running at different speeds. This heterogeneity results in distributed-memory parallel computing systems created from commodity components that can satisfy specific computational requirements of the Earth and space sciences community. This paper explores techniques for mapping hyperspectral image analysis algorithms onto heterogeneous networks of workstations. Important aspects of algorithm design such as portability, reusability and scalability are illustrated using homogeneous and heterogeneous parallel computing facilities at NASA's Goddard Space Flight Center, the European Center for Parallelism of Barcelona, and the University of Extremadura in Spain. Hyperspectral image data from the AVIRIS data repository is used in experiments, which reveal that heterogeneous networks of workstations are a source of computational power that is both accessible and able to deliver results quickly enough for practical use in information extraction applications from hyperspectral imagery.