A methodology for distributing a global climate model among computers connected by a wide area network is presented. The application consists of a model of the global atmosphere coupled to a model of the world ocean. It is demonstrated that a 'metacomputer' consisting of a CRAY Y-MP at the Jet Propulsion Laboratory and an Intel Delta at the California Institute of Technology, connected by a high-speed (gigabit per second) network, can achieve a superlinear speedup in the execution of the atmospheric component of the global climate model code, despite the added overheads due to latency and communication delays.
Continuous monitoring of computer network performance is probably the only solution that allows prompt identification of anomalous operating conditions and provides the parameters on which to base speedy, effective recovery interventions. A number of tools for performance management already exist, but their effectiveness is limited because they are essentially embedded in proprietary network management solutions. In this paper we describe the realization of PMt, a platform for the development of performance management applications for the control and real-time management of a heterogeneous computer network. PMt imposes an object-oriented view of the distributed system, defines a performance management design methodology that adapts well to the management of truly distributed systems, and provides the user with a set of tools and services to assist both in the design of new performance management applications and in the actual management of the whole system.
The paper presents a process-replication protocol that aims to provide fault tolerance as well as performance improvement to applications such as long-running and real-time tasks. An identical delivery order of messages is enforced on all replicas of a troupe using multicasts for inter- and intra-troupe communication. The detailed design of the protocol is given in the paper. The protocol is self-contained in the sense that crashes in a troupe are handled internally without affecting the operation of other troupes. The crash-handling procedure is simple, and the associated overhead during failure-free operation is small. The protocol takes advantage of the redundancy of processes to expedite the completion of a distributed task by speeding up the determination of message sequences and the transmission of outgoing data messages, at the expense of small control messages. Simulation is carried out to show the performance improvement.
Delay testing continues to gain importance as manufacturers try to meet stricter shipped quality requirements for higher performance and higher density integrated circuits. The methodology for obtaining a set of high quality tests that detect gate delay faults is unfortunately computationally intensive enough to be intractable for reasonably large VLSI circuits; parallelization of these computations is thus an attractive option. In this paper, we present, for the first time, distributed algorithms for gate delay fault simulation and fault coverage determination through test quality evaluation. These algorithms are implemented over a network of workstations, which is normally available at most design labs, and thus do not rely on the use of very specialized, expensive, or difficult-to-access hardware. The algorithms are theoretically analyzed, and experimental studies of their implementation are reported. The results conform to the theoretically predicted performance, with speedups of up to 10 being obtained with 15 workstations.
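The reported figure can also be read as a parallel efficiency. The following is a minimal sketch, assuming the usual definition of efficiency as speedup divided by the number of workers; the abstract reports only the speedup, and the function name is illustrative.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Return speedup divided by worker count (the standard definition)."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Figures quoted in the abstract: speedup of up to 10 on 15 workstations.
eff = parallel_efficiency(10, 15)
print(f"efficiency = {eff:.2f}")
```

Under this definition, the best case quoted above corresponds to an efficiency of roughly two thirds.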
Distributed applications spanning multiple nodes should be capable of providing fast response. Computation servers equipped with powerful processors and large memories can improve performance by making better use of system resources, for example by offloading overloaded nodes. A computation service may comprise all the nodes of the system (in case all are equipped with the required resources) or a small subset, such as that provided by a pool of computation servers (as in Amoeba). In both cases there can be a large number of service providers (compute servers). A server selection service is required to choose the most suitable server to provide the service. This paper is concerned with the design of adaptive computation server selection, with scale recognized as a primary design and implementation factor. Adaptive system partitioning into domains is advocated as a key design principle for scalability. The model is demonstrated with compute-server selection using adaptive partitioning, compared to the same service using random selection and probing.
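The two baseline policies named above can be sketched as follows. This is an illustration only: it assumes that "probing" means polling a small random sample of servers and taking the least loaded one, which the abstract does not spell out. The function names and load figures are hypothetical, and the adaptive partitioning scheme itself is not modeled.

```python
import random


def select_random(loads, rng):
    """Baseline 1: pick any server uniformly at random, ignoring load."""
    return rng.choice(sorted(loads))


def select_by_probing(loads, rng, probes=3):
    """Baseline 2 (assumed meaning): probe a few random servers, take the least loaded."""
    candidates = rng.sample(sorted(loads), min(probes, len(loads)))
    return min(candidates, key=lambda server: loads[server])


# Hypothetical load figures (fraction of capacity in use) for four servers.
loads = {"s1": 0.9, "s2": 0.2, "s3": 0.6, "s4": 0.4}
rng = random.Random(0)
print(select_random(loads, rng), select_by_probing(loads, rng))
```

Both baselines pick from the full server set, so their cost or their choice quality suffers as the number of servers grows, which is the scaling concern the abstract raises.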
ISBN: 0818675578 (print)
A programming model that is widely accepted today for large applications is parallel programming with shared variables. We propose an implementation of shared arrays on distributed memory architectures: it provides the user with a uniform addressing scheme while remaining efficient thanks to a logical paging technique and optimized communication mechanisms.
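One way to picture a uniform addressing scheme built on logical pages is the address translation below. It is a sketch under stated assumptions, not the authors' design: the round-robin placement of pages on nodes, the `Location` type, and the `locate` function are all hypothetical.

```python
from typing import NamedTuple


class Location(NamedTuple):
    node: int    # node that owns the page
    page: int    # index of the page within that node's local store
    offset: int  # element offset inside the page


def locate(index: int, page_size: int, num_nodes: int) -> Location:
    """Translate a global array index into (node, local page, offset)."""
    if index < 0 or page_size <= 0 or num_nodes <= 0:
        raise ValueError("index must be >= 0; page_size and num_nodes must be > 0")
    logical_page, offset = divmod(index, page_size)
    # Assumed policy: logical pages are dealt round-robin to the nodes.
    local_page, node = divmod(logical_page, num_nodes)
    return Location(node, local_page, offset)


print(locate(13, page_size=4, num_nodes=3))
```

The caller works only with global indices, and the translation decides whether an access is local or needs communication with the owning node.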
ISBN: 0780372778 (print)
In this paper a transmission line of finite length conductor with bend is *** expressions for the per-unit-length parameters of the line are derived by using a non-uniform transmission line approach and are verified by the method of moments.
Distributed Shared Memory (DSM) systems have been proposed to combine the programmability of traditional shared memory and the scalability of message-passing systems. Eager DSM systems can greatly reduce access latencies for remote data by keeping copies of shared values in local memory and updating them immediately whenever a shared datum changes. However, sharing all changes globally can limit system performance. It is usually possible to transform a program into an equivalent form that generates much less traffic and therefore executes much more efficiently. This paper describes a compile-time analysis model for transforming simple shared memory programs with parallelized loop structures into programs that are optimized for efficient execution on eager DSM systems.
ISBN: 0769519199 (print)
This paper discusses the impact of the hierarchical master-worker paradigm on the performance of an application program that solves an optimization problem by a parallel branch and bound algorithm on a distributed computing system. The application program addressed in this paper solves the BMI Eigenvalue Problem, an optimization problem that minimizes the greatest eigenvalue of a bilinear matrix function. This paper proposes a parallel branch and bound algorithm that solves the BMI Eigenvalue Problem with the hierarchical master-worker paradigm. The experimental results showed that the conventional algorithm with the master-worker paradigm significantly degraded performance on a Grid test bed, where computing resources were distributed over a WAN via a firewall; however, the hierarchical master-worker paradigm sustained good performance.
ISBN: 0769519199 (print)
The proceedings contain 95 papers. The topics discussed include: a parallel FPT application for clusters; a synthesis of parallel out-of-core sorting programs on heterogeneous clusters; noncontiguous I/O accesses through MPI-IO; leveraging non-uniform resources for parallel query processing; a performance oriented migration framework for the grid; scheduling distributed applications: the SimGrid simulation framework; fair share on high performance computing systems: what does fair really mean?; distributed computing with hierarchical master-worker paradigm for parallel branch and bound algorithm; programming for dependability in a service-based grid; improving access to multi-dimensional self-describing scientific datasets; and merging the CCA component model with the OGSI framework.