When a set of geographically distributed autonomous clusters of workstations is combined into a single large-scale virtual distributed system, a control mechanism that coordinates the activities of the combined system is indispensable for effective utilisation of its resources. This paper presents a co-ordination mechanism suitable for scheduling and distributing services across the hosts of such large-scale virtual distributed systems. The proposed scheme is scalable and reliable. It also combines centralised and decentralised co-ordination mechanisms while eliminating or minimising their drawbacks.
This paper presents an algorithm for scheduling communication-intensive parallel applications in workstation cluster environments. The proposed scheduling policy combines the best attributes of both space-sharing and time-sharing principles and coexists with local schedulers (e.g., the Windows NT scheduler). It provides coordinated scheduling and can be generalised to provide a wide range of resource abstractions.
We investigate instability and reversibility within hybrid Monte Carlo simulations using a nonperturbatively improved Wilson action. We demonstrate the onset of instability as tolerance parameters and molecular dynamics step sizes are varied. We compare these findings with theoretical expectations and present limits on simulation parameters within which a stable and reversible algorithm is obtained for physically relevant simulations. Results of optimization experiments with respect to tolerance parameters are also presented.
ISBN: (print) 1581132883
This paper describes the definition and implementation of an OpenMP-like set of directives and library routines for shared memory parallel programming in Java. A specification of the directives and routines is proposed and discussed. A prototype implementation, JOMP, consisting of a compiler and a runtime library, both written entirely in Java, is presented, which implements a significant subset of the proposed specification.
Presents the results of the NRW-Metacomputing Taskforce, which has been working on the development of a (German) country-wide metacomputer since 1996. The resulting installation is among the very few that are already operational, have full support for heterogeneous resources, contain a decent security model and feature an advanced scheduling subsystem for the metacomputing environment. The NRW-Metacomputer has been implemented using a modular software architecture. Hence, its concepts and components can be re-used by others without the need to obtain the metacomputing software as a whole. Furthermore, the NRW-Metacomputer already provides well-defined interfaces for linking the system with other metacomputing environments to form a truly global computational grid. Distinctive features of this system are its highly scalable and fault-tolerant software architecture, its advanced resource planning mechanisms, as well as an integration into a DCE (Distributed Computing Environment)/DFS (Distributed File System) environment.
A comparison of the Buneman version of the block cyclic reduction (BCR) algorithm and Stride Reduction (BSR) based on polynomial factorization for separable elliptic equations with Dirichlet boundary conditions is presented. This study was initiated by an interest in the parallel computing techniques that can be used to increase the computational efficiency of these model problems.
The concept of software architecture has created a new scenario for incorporating non-functional and transactional requirements into the software design. Transactional and non-functional requirements can be included i...
ISBN: (print) 9780780398023
The current trend in HPC hardware is towards clusters of shared-memory (SMP) compute nodes. For applications developers the major question is how best to program these SMP clusters. To address this we study an algorithm from Discrete Element Modeling, parallelized using both the message-passing and shared-memory models simultaneously (“hybrid” parallelization). The natural load-balancing methods are different in the two parallel models, the shared-memory method being in principle more efficient for very load-imbalanced problems. It is therefore possible that hybrid parallelism will be beneficial on SMP clusters. We benchmark MPI and OpenMP implementations of the algorithm on MPP, SMP and cluster architectures, and evaluate the effectiveness of hybrid parallelism. Although we observe cases where OpenMP is more efficient than MPI on a single SMP node, we conclude that our current OpenMP implementation is not yet efficient enough for hybrid parallelism to outperform pure message-passing on an SMP cluster.
The persistence exponent, θ, is defined by NF ∼ t^(−θ), where t is the time since the start of the coarsening process and the “no-flip fraction,” NF, is the number of points that have not seen a change of “color” since t=0. Here we investigate numerically the persistence exponent for a binary fluid system where the coarsening is dominated by hydrodynamic transport. We find that NF follows a power law decay (as opposed to exponential) with the value of θ somewhat dependent on the domain growth rate (L ∼ t^α, where L is the average domain size), in the range θ=1.23±0.1 (α=2/3) to θ=1.37±0.2 (α=1). These α values correspond to the inertial and viscous hydrodynamic regimes, respectively.
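As a rough illustration of how an exponent of this kind can be read off from data, the sketch below fits a straight line to log(NF) against log(t) and reports the negated slope as θ. It is not the authors' analysis: the function name `fit_power_law_exponent` and the synthetic input are assumptions made for this example, and a real measurement would also need an error estimate and a choice of fitting window.

```python
import math


def fit_power_law_exponent(times, no_flip_fraction):
    """Estimate theta in NF ~ t^(-theta) by ordinary least squares on log-log data.

    Hypothetical helper for illustration only; both inputs must be positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(nf) for nf in no_flip_fraction]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope of the best-fit line through (log t, log NF).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # A decaying power law has a negative slope, so theta is its negation.
    return -slope


# Synthetic, noise-free data generated with theta = 1.23 (assumed, not measured).
times = [float(t) for t in range(1, 101)]
synthetic_nf = [t ** -1.23 for t in times]
theta = fit_power_law_exponent(times, synthetic_nf)
```

On exponential rather than power-law decay, the same log-log plot would curve downwards instead of following a straight line, which is one simple way to tell the two behaviours apart.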
The purpose of this paper is to present an algorithm for matrix multiplication based on a formula discovered by Pan [7]. For matrices of order up to 10 000, the nearly optimum tuning of the algorithm results in a rath...