The Portable parallel/distributed Debugger project at the NASA Ames Research Center has built a debugger for applications running on heterogeneous computational grids. It employs a client-server architecture to simplify the implementation, and its user interface has been designed to provide process control and state examination functions on computations with a large number of processes. The debugger can find processes participating in distributed computations even when those processes were not created under debugger control. In addition to working in a computational grid environment, these techniques also work on other distributed memory jobs, such as those initiated by "mpirun".
Introduces and evaluates a new efficient dynamic load-balancing scheme for parallel molecular dynamics simulation on distributed memory machines. It decomposes a spatial domain of particles into disjoint parts, each of which corresponds to a processor and dynamically changes its shape to keep almost the same number of particles throughout the simulation. In contrast to other similar schemes, ours requires no long-distance inter-processor communication, only communication among adjacent processors (and thus little communication overhead), yet it still guarantees fast reduction of load imbalance among the processors. It owes these advantages mainly to the following features. (1) Sufficiently accurate global load information is obtained through the stepwise propagation of appropriate information via nearest-neighbor communication. (2) In addition to the global load balancing, another load-balancing procedure is invoked on each processor without global load information in order to suppress rapid increases or decreases in load. Thus, information from remote processors can provide reliable values even after a certain period of delay. To evaluate the effectiveness of our scheme, we have integrated our load balancer into the publicly available NAMD simulation system by replacing its built-in load-balancing component. Preliminary experiments on a cluster of workstations connected through Myrinet switches show that it successfully reduces load imbalance and improves simulation performance.
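The idea of reducing imbalance using only adjacent-processor exchanges can be illustrated with a generic first-order diffusion scheme on a one-dimensional chain of processors. This is a minimal sketch under stated assumptions, not the scheme proposed in the abstract: the function names, the chain topology, and the `alpha` exchange fraction are all hypothetical, and loads are treated as divisible real numbers rather than particle counts.

```python
def diffusion_step(loads, alpha=0.25):
    """One balancing step: each pair of adjacent processors exchanges
    a fraction alpha of their load difference (nearest-neighbor only)."""
    new = list(loads)
    for i in range(len(loads) - 1):
        # Positive flow moves load from processor i to processor i + 1.
        flow = alpha * (loads[i] - loads[i + 1])
        new[i] -= flow
        new[i + 1] += flow
    return new


def imbalance(loads):
    """Relative excess of the heaviest processor over the mean load."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean - 1.0


def balance(loads, steps, alpha=0.25):
    """Apply the nearest-neighbor step repeatedly (assumed helper)."""
    for _ in range(steps):
        loads = diffusion_step(loads, alpha)
    return loads


initial = [100.0, 0.0, 0.0, 0.0]
balanced = balance(initial, steps=50)
```

Each step conserves the total load, and information about a remote overload reaches a processor only after several steps, which is the kind of delayed propagation the abstract refers to.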
ISBN: 9780769508375 (print)
In this paper, we examine various modeling and simulation applications of cluster computing using a Beowulf cluster. These applications are used to investigate the performance of our cluster in terms of computational speedup, scalability, and communications. The applications include solution of linear systems by Jacobi iteration, distributed image generation, and the finite difference time domain solution of Maxwell's equations. It is observed that the computational load for these applications must be large compared to the communication overhead to take advantage of the speedup obtained using parallel computing. For the applications reviewed here, this condition is increasingly satisfied as the problem size becomes larger or as higher resolution is required.
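Jacobi iteration, one of the applications listed above, can be sketched in a few lines. The serial version below is an illustration only, assuming a diagonally dominant matrix; the function name, tolerance, and iteration limit are hypothetical and do not come from the paper.

```python
def jacobi(A, b, iterations=100, tol=1e-10):
    """Solve A x = b by Jacobi iteration, starting from the zero vector."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        x_new = []
        for i in range(n):
            # Row i uses only values from the previous iterate x.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x_new.append((b[i] - s) / A[i][i])
        # Stop once the largest component change falls below tol.
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new
        x = x_new
    return x


solution = jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

Because every row update depends only on the previous iterate, rows can be assigned to different cluster nodes. The nodes must then exchange their updated entries after each sweep, which is the communication overhead that the computational load per node has to outweigh.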
Computational Grids have become an important and popular computing platform for both scientific and commercial distributed computing communities. However, users of such systems typically find that achieving good application execution performance remains challenging. Although Grid infrastructures such as Legion and Globus provide basic resource selection functionality, work allocation functionality, and scheduling mechanisms, applications must interpret system performance information in terms of their own requirements in order to develop performance-efficient schedules. We describe a new high-performance scheduler that incorporates dynamic system information, application requirements, and a detailed performance model in order to create performance-efficient schedules. While the scheduler is designed to provide improved performance for a magnetohydrodynamics simulation in the Legion Computational Grid infrastructure, the design is generalizable to other systems and other data-parallel, iterative codes. We describe the adaptive performance model, resource selection strategies, and scheduling policies employed by the scheduler. We demonstrate the improvement in application performance achieved by the scheduler in dedicated and shared Legion environments.
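For a data-parallel, iterative code, one simple way to turn dynamic system information into a schedule is to split the work in proportion to each host's predicted speed. The sketch below illustrates that general idea only; it is not the scheduler described in the abstract, and the function names, the `speeds` input (work units per second, as might be estimated from monitoring data), and the `comm_cost` term are assumptions made for this example.

```python
def allocate_work(total_units, speeds):
    """Split total_units across hosts in proportion to predicted speed."""
    total_speed = sum(speeds)
    shares = [int(total_units * s / total_speed) for s in speeds]
    # Hand out the units lost to rounding down, fastest hosts first.
    leftover = total_units - sum(shares)
    order = sorted(range(len(speeds)), key=lambda i: speeds[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def predicted_iteration_time(shares, speeds, comm_cost=0.0):
    """An iteration finishes when the slowest host does, plus a fixed
    (assumed) communication cost."""
    return max(sh / sp for sh, sp in zip(shares, speeds)) + comm_cost


speeds = [1.0, 1.0, 2.0]
shares = allocate_work(100, speeds)
iteration_time = predicted_iteration_time(shares, speeds)
```

In a shared environment the speeds change over time, so a scheduler of this kind would re-estimate them and recompute the allocation as conditions change.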
ISBN: 0780365615 (print)
Power electronic systems are described by nonlinear differential equations, typically of high order. Simulation of such systems increasingly requires the high speeds available only on multiprocessor computing systems. However, the traditional methods for partitioning the system are rather inconvenient, primarily because of the additional burden created by the current operation associated with distributed systems. The concepts and methods of structural modeling provide a way to overcome this difficulty. The essence of structural modeling is the assignment of a specific computing resource to each object of the simulated physical system. The simulation of each object then operates as a formally independent procedure, and interactions between objects are implemented at the data link level. This allows parallel execution of the separate procedures on the allocated multiprocessor computing hardware. This provides two advantages: (1) a decrease in simulation time, and (2) simplification of the programming process.
We consider a quantum computational algorithm that can be used to determine (probabilistically) how close a given signal is to one of a set of previously observed signals stored in the state of a quantum neurocomputational machine. The realization of a new quantum algorithm for factorization of integers by Shor and its implications for cryptography have created a rapidly growing field of investigation. Although no physical realization of a quantum computer is available, a number of software systems simulating a quantum computation process exist. In light of the rapidly increasing power of desktop computers and their ability to carry out these simulations, it is worthwhile to investigate possible advantages as well as realizations of quantum algorithms in signal processing applications. The algorithm presented offers a glimpse of the potential of this approach. Neural networks (NN) provide a natural paradigm for parallel and distributed processing of a wide class of signals. Neural networks within the context of classical computation have been used for approximation and classification tasks with some success. We propose a model for quantum neurocomputation (QN) and explore some of its properties and potential applications to signal processing in an information-theoretic context.
ISBN: 0769500595 (print)
This paper introduces a hybrid associative memory/SIMD parallel processor, APPLES, which has been specifically designed for logic simulation. It reviews the computational structure that permits parallel execution of logic gate evaluations in memory. This facilitates fine-grain execution, on a massive scale, of the basic tasks inherent in VLSI logic simulation. Furthermore, unlike other SIMD approaches, the simulation is not limited to a unit delay model; complex delays such as inertial delays are permissible. The processor has been implemented in Verilog and assessed using ISCAS-85 benchmarks. Gate evaluation is executed in constant time, whereas the time for updating fan-out lists grows with circuit size. However, the APPLES architecture enables this latter task to be parallelised subject to various system parameters. The most important constraint is identified as the fan-out memory access time relative to the scan rate of the associative memory.
ISBN: 0769500595 (print)
The simulation of incompressible fluids is one of the important problem classes in computational fluid dynamics. We consider a simulation algorithm for convection in binary fluid mixtures, a problem in which a quite simple model describes very complex behavior. In a parallel implementation on an IBM SP2, we investigate several implementation strategies involving different data layouts and communication organizations.
Authors: Rosato, V.; Pucello, N.
ENEA, HPCN Project, Ente Nuove Tecnol Energia & Ambiente, I-00100 Rome, Italy
ISBN: 0769500595 (print)
A code for the simulation of the X-ray diffraction pattern of a powder has been implemented on a massively parallel SIMD platform developed in the frame of the PQE2000 Project. The code allows the evaluation of the diffraction pattern of atomic-scale models of both perfectly ordered and disordered structures. The code has been used to investigate the structures resulting from the non-equilibrium alloying process of an immiscible metallic couple (Ag-Cu).
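A standard way to compute a powder pattern directly from an atomic-scale model is the Debye scattering equation, I(q) = Σ_i Σ_j f_i f_j sin(q r_ij) / (q r_ij), which sums over all atom pairs and applies equally to ordered and disordered structures. The abstract does not state which formula the code uses, so the sketch below only illustrates this general technique. It assumes a single constant scattering factor `f` for all atoms, and the function name is hypothetical.

```python
import math


def debye_intensity(positions, q, f=1.0):
    """Powder-averaged intensity at scattering-vector magnitude q for a
    list of 3-D atom positions, using the Debye scattering equation."""
    n = len(positions)
    # Self terms (i == j): sin(x)/x tends to 1 as x tends to 0.
    total = n * f * f
    for i in range(n):
        for j in range(i + 1, n):
            x = q * math.dist(positions[i], positions[j])
            # Each unordered pair appears twice in the double sum.
            total += 2.0 * f * f * (math.sin(x) / x if x != 0.0 else 1.0)
    return total


pair = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
intensity_at_pi = debye_intensity(pair, math.pi)
```

The pair terms are independent of one another, which makes this kind of sum a natural fit for a data-parallel platform.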