ISBN: 0769515126 (print)
This paper presents a two-level parallel evolutionary algorithm for solving function optimization problems that have multiple solutions. The algorithm combines global search and local search: the global search draws individuals closer to each optimal solution while preserving the genetic diversity of the population. Different individuals are then selected for local evolution in their respective neighborhoods. Numerical experiments indicate that this simple, easy-to-handle algorithm is very practical: all optimal solutions can be found in a single run within a fairly short period of time.
ISBN: 0769515126 (print)
The paper concerns parallel computing and its application to computing the full set of Lyapunov exponents of general nonlinear parameter-dependent continuous ordinary differential equations. Based on the standard serial algorithm developed by Wolf et al. [1], we present a parallel algorithm using the block-cyclic decomposition method and apply it to compute the Lyapunov exponents of a continuous differential equation. Performance tests on the DAWNING-2000II supercomputer show that the parallel algorithm has a high level of parallelism, needs no message passing (little communication cost), and performs little I/O. In addition, the algorithm can be extended to ordinary differential equations of any high dimension.
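The abstract does not say how the block-cyclic decomposition is applied to the Lyapunov computation. As a rough illustration only, the generic one-dimensional block-cyclic index mapping can be sketched as follows; the function names and the parameters `block` and `nprocs` are hypothetical and not taken from the paper.

```python
def block_cyclic_owner(i, block, nprocs):
    """Map global index i to (process rank, local index).

    Assumption: indices are cut into consecutive blocks of size `block`,
    and the blocks are dealt out to the processes round-robin.
    """
    blk = i // block                      # which block the index falls in
    rank = blk % nprocs                   # blocks are assigned cyclically
    local = (blk // nprocs) * block + i % block
    return rank, local


def block_cyclic_indices(n, block, nprocs, rank):
    """List the global indices in 0..n-1 that are owned by `rank`."""
    return [i for i in range(n) if (i // block) % nprocs == rank]


if __name__ == "__main__":
    for r in range(3):
        print(r, block_cyclic_indices(10, 2, 3, r))
```

With 10 work items, blocks of 2 and 3 processes, each process receives every third block, which spreads the work evenly without any exchange of data between processes.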
ISBN: 3540440496 (print)
PELCR is an environment for lambda-term reduction on parallel/distributed computing systems. The computation performed in this environment is a distributed graph rewriting, and a major optimization for efficient execution is a message aggregation technique that can strongly reduce the communication overhead. In this paper we discuss the interaction between the effectiveness of aggregation and the schedule sequence of rewriting operations. We then present a Priority Based (PB) scheduling algorithm well suited to the specific aggregation technique. Results on a classical benchmark λ-term demonstrate that PB allows PELCR to achieve up to 88% of the ideal speedup while executing on a shared memory parallel architecture.
ISBN: 3540440496 (print)
The proceedings contain 140 papers. The special focus in this conference is on parallel processing. The topics include: orchestrating computations on the world-wide web; non-massive, non-high performance, distributed computing; facts on performance evaluation and its dependence on workloads; concepts and technologies for a worldwide grid infrastructure; a performance analysis tool for distributed and parallel programs; a hybrid strategy for automated performance problem searches; on the scalability of tracing mechanisms; component based problem solving environment; integrating temporal assertions into a parallel debugger; performance evaluation, analysis and optimization; prototyping and verifying stream-processing systems; symbolic cost estimation of parallel applications; performance modeling and interpretive simulation of PIM architectures and applications; extended overhead analysis for OpenMP; a call-graph based automatic tool for capture of hardware performance metrics for MPI and OpenMP applications; performance tuning through source code interdependence; on scheduling task-graphs to LogP-machines with disturbances; optimal scheduling algorithms for communication constrained parallel processing; an automatic scheduler for parallel machines; non-approximability results for the hierarchical communication problem with a bounded number of clusters; non-approximability of the bulk synchronous task scheduling problem; adjusting time slices to apply coscheduling techniques in a non-dedicated NOW; a semi-dynamic multiprocessor scheduling algorithm with an asymptotically optimal competitive ratio; tiling and memory reuse for sequences of nested loops; towards detection of coarse-grain loop-level parallelism in irregular computations; and parallel and distributed databases, data mining and knowledge discovery.
ISBN: 3540440496 (print)
Tree search algorithms play an important role in many applications in the field of artificial intelligence. When playing board games such as chess, computers use game tree search algorithms to evaluate a position. In this paper, we present a procedure that we call Parallel Controlled Conspiracy Number Search (Parallel CCNS). We briefly describe the principles of the sequential CCNS algorithm, which bases its approximation results on irregular subtrees of the entire game tree. We have parallelized CCNS and implemented it in our chess program ***, which is now the first in the world to have won a highly ranked Grandmaster chess tournament. We add experiments that show a speedup of about 50 on 159 processors running on an SCI workstation cluster.
ISBN: 0780375963 (print)
Parallel processing is a vital tool for many scientific and industrial applications where real-time constraints apply; in many applications the use of parallel processing and multiprocessor platforms seems to be the favourable solution for achieving acceptable throughput. Hence parallel processing algorithms are vital tools for achieving a good trade-off between hardware cost, system efficiency and power. In this paper, the one-dimensional generalised parallel block filter algorithm based on the overlap-add approach is implemented on a multi-DSP platform. The mathematical concept of the input stage, the output stage and the generalised direct filter equation are given. The 1-D parallel algorithm is also shown and a suitable parallel architecture is presented.
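As background for the overlap-add approach named in the abstract, here is a minimal pure-Python sketch of textbook overlap-add FIR filtering. It is not the generalised block filter of the paper: the function names and the `block_len` parameter are assumptions made for illustration, and the per-block convolution is done directly rather than on separate DSPs.

```python
def convolve(x, h):
    """Direct linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def overlap_add(x, h, block_len):
    """Filter x with FIR taps h by splitting x into blocks of block_len.

    Each block is convolved on its own, so the blocks could be handed to
    different processors; the tails that spill past a block boundary are
    added into the output region of the following block.
    """
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block_len):
        part = convolve(x[start:start + block_len], h)
        for k, v in enumerate(part):
            y[start + k] += v             # overlapping tails accumulate here
    return y


if __name__ == "__main__":
    print(overlap_add([1, 2, 3, 4, 5], [1, 1], block_len=2))
```

Because every block is filtered independently, the only coupling between blocks is the final addition of the overlapping tails, which is what makes the scheme attractive for a multiprocessor implementation.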
Biological sequence comparison is an important tool for researchers in molecular biology. There are several algorithms for sequence comparison. The Smith-Waterman algorithm, based on dynamic programming, is one of the...
ISBN: 3540440496 (print)
Service-based architectures enable the development of new classes of Grid and distributed applications. One of the main capabilities provided by such systems is the dynamic and flexible integration of services, under which a service may be part of more than one distributed system and serve different applications simultaneously. This increased flexibility in system composition makes it difficult to address classical distributed system issues such as fault tolerance. While it is relatively easy to make an individual service fault-tolerant, improving the fault tolerance of services collaborating in multiple application scenarios is a challenging task. In this paper, we look at the issue of developing fault-tolerant service-based distributed systems, and propose an infrastructure that implements fault tolerance capabilities transparently to services.
ISBN: 0769515126 (print)
We propose an improved version of the CGS method for solving large, sparse linear systems of equations with unsymmetric coefficient matrices. The proposed method combines elements of numerical stability and parallel algorithm design without increasing computational costs. The algorithm is derived such that all matrix-vector multiplications, inner products and vector updates of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Therefore, the cost of global communication, which is the performance bottleneck, can be significantly reduced. In this paper, the Bulk Synchronous Parallel (BSP) model is used to design a fully efficient, scalable and portable parallel version of the proposed algorithm and to provide accurate performance prediction for a wide range of architectures, including the Cray T3D, the Parsytec, and a cluster of workstations connected by Ethernet. This performance model uses only a few system-dependent parameters, based on simple and accurate cost modelling, to provide useful insight into the time complexity of the method. The theoretical performance predictions are compared with some preliminary measured timing results of a numerical application from ocean flow simulation.
ISBN: 0769515126 (print)
In this paper, an improved version of the BiCGStab method (IBiCGStab) for solving large, sparse linear systems of equations with unsymmetric coefficient matrices is proposed. The method combines elements of numerical stability and parallel algorithm design without increasing the computational costs. The algorithm is derived such that all inner products of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. Therefore, the cost of global communication, which is the bottleneck of the parallel performance, can be significantly reduced. The resulting IBiCGStab algorithm maintains the favorable properties of the original method while not increasing computational costs. A data distribution suitable for both irregularly and regularly structured matrices, based on an analysis of the non-zero matrix elements, is presented. The communication scheme is supported by overlapping the execution of computation and communication to reduce waiting times. The efficiency of this method is demonstrated by numerical experiments carried out on a massively parallel distributed memory system.
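The benefit of having independent inner products within one iteration can be illustrated with a small simulation: when no inner product depends on the result of another, their local partial sums can be sent through a single global reduction instead of one reduction each. The sketch below is not the IBiCGStab algorithm itself; the function names, the even slicing of vectors across `nprocs` simulated processes, and the list-based "reduction" are all assumptions made for illustration.

```python
def local_partials(pairs, lo, hi):
    """Partial dot products of every (u, v) pair over the slice [lo, hi)."""
    return [sum(u[i] * v[i] for i in range(lo, hi)) for u, v in pairs]


def batched_dots(pairs, nprocs):
    """Compute several independent inner products with one reduction step.

    Each simulated process owns one contiguous slice of every vector and
    computes its partial sums for all pairs at once. A single reduction
    (here a plain column-wise sum) then yields all results together.
    """
    n = len(pairs[0][0])
    bounds = [(p * n // nprocs, (p + 1) * n // nprocs) for p in range(nprocs)]
    partials = [local_partials(pairs, lo, hi) for lo, hi in bounds]
    return [sum(column) for column in zip(*partials)]


if __name__ == "__main__":
    r = [1, 2, 3, 4]
    s = [4, 3, 2, 1]
    t = [1, 1, 1, 1]
    print(batched_dots([(r, s), (r, t), (s, s)], nprocs=2))
```

On a real distributed memory machine the column-wise sum would be one collective operation, and a non-blocking variant would let the vector updates proceed while it completes.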