ISBN: 0818675829 (print)
We discuss here the emergent Web-based distributed environments for HPCC on the NII, with a focus on Java as an enabling technology. We start with a review of the past, present and near-term future of the 'Java phenomenon', presented here against the background of some related previous approaches towards a distributed interpretative virtual machine architecture.
ISBN: 0818675829 (print)
The performance of I/O-intensive applications on a multiprocessor system depends mostly on the various disk access delays encountered in the I/O system. Over the years, disk performance has improved more slowly than processor speeds. It is therefore necessary to model I/O delays and evaluate the performance benefits of moving an application to a better multiprocessor system. In this work, we perform such an analysis by measuring I/O delays for a synthesized application that uses a parallel distributed file system. The aim of this study was to evaluate the performance benefits of better disks in a multiprocessor system that was designed a few years ago. We report how I/O performance would be affected if an application were run on a system with better disks and communication links. We show a substantial improvement in the performance of the I/O system with better disks and communication links relative to the existing system.
ISBN: 081864222X (print)
In distributed application domains where data change rapidly, it is often desirable for programs to obtain the latest available data values to achieve accurate computations. Example applications are financial services and network management. Such data are logically shared by a network of programs. Unlike data in traditional databases, rapidly changing data are usually not lockable by (client) programs, and it is crucial to the computations to access their values in a timely manner. In these application domains, a typical program usually performs computations based on recently available data values obtained from the network. However, these data values may be inconsistent or obsolete, since the real data are external to the system and may change more rapidly than can be reflected by their copies within the system. Decision making based on such inaccurate computations can lead to substantial penalties. In this paper, we propose an approach to delaying data value retrieval until the value is needed in distributed programming, given that data and configurations change rapidly. This approach offers the advantage of obtaining more recent data values, resulting in more accurate computations and decision making.
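The idea of delaying retrieval until a value is needed can be illustrated with a minimal sketch. This is not the mechanism proposed in the paper, which the abstract does not describe; the `Deferred` class, the `decide` function, and the dictionary standing in for an external data source are all hypothetical names introduced here for illustration.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class Deferred(Generic[T]):
    """Hypothetical handle that postpones a read until get() is called."""

    def __init__(self, fetch: Callable[[], T]) -> None:
        self._fetch = fetch  # stored, not invoked, at construction time

    def get(self) -> T:
        # The read happens here, so the caller sees the value that is
        # current at the point of use rather than at the point of binding.
        return self._fetch()


def decide(price: Deferred[float], limit: float) -> bool:
    """Hypothetical decision step: accept only if the price is within the limit."""
    # Any lengthy local computation would run before this line;
    # the price is retrieved only when the comparison needs it.
    return price.get() <= limit


# A dictionary stands in for an external, rapidly changing data source.
source: Dict[str, float] = {"px": 100.0}

eager_price = source["px"]                       # retrieved immediately
lazy_price = Deferred(lambda: source["px"])      # retrieval postponed

source["px"] = 105.0                             # the external value changes

stale_decision = eager_price <= 102.0            # based on the old copy
fresh_decision = decide(lazy_price, 102.0)       # based on the current value
```

In this sketch the eager copy leads to a decision based on an obsolete value, while the deferred handle reads the value in effect when the decision is actually made.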
We study the improvement in performance obtained in distributed memory machines through the use of a separate network that serves multiple I/O nodes operating under a distributed file system. For a hypercube architect...
This paper discusses the effect of processor failures on computation performed on two-dimensional VLSI processor arrays. Previously established properties of catastrophic fault patterns are used to study inherent limi...
A new programming style for large-scale parallel programs, centered around distributed data structures, has emerged. The current parallel program visualization tools were intended for the old style and do not deal with distributed data structures. We show, with several examples of visualizations and animations developed for large-scale pC++ programs, that visualizing and animating distributed data structures is an important part of debugging and performance tuning for new-style parallel programs. Our approach is based on a new methodology for recording execution behavior that uses I/O abstractions and compile-time source analysis and instrumentation.
Transparency, minimal interference, minimal residual dependencies, efficiency and robustness are some of the features that are felt necessary for process migration mechanisms in distributed operating systems. None of ...
ISBN: 081864222X (print)
In parallel A* graph search on distributed-memory machines, different processors may perform significant duplicated work if inter-processor duplicates are not pruned. The only known method for duplicate pruning associates a particular processor with each distinct node of the search space using a suitable hash function. Duplicate nodes arising in different processors are then transmitted to the same processor and thereby pruned. There are two main drawbacks attributable to such an approach: (1) load balance is determined solely by the hash function and is unsatisfactory; (2) node transmissions for duplicate pruning are global, which can lead to hot spots in the network. We propose two different duplicate pruning techniques that outperform this hashing-only method by using: (1) separate algorithms for duplicate pruning and load balancing, and (2) a novel search space partitioning scheme that evenly spreads out the bandwidth requirement for pruning over the entire parallel architecture. Using the Traveling Salesman Problem (TSP) as a test case, we find that on a 10-dimensional nCUBE2 hypercube multicomputer, our pruning strategies yield a speedup improvement of more than 135% over previous methods that do not prune any duplicates, and more than 155% over the hashing-only pruning scheme.
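The hashing-only baseline described above (not the two techniques the paper proposes) can be sketched in a few lines. This is a single-process simulation under stated assumptions: the function names `owner` and `route_and_prune`, the use of SHA-256 as the hash, and the tuple encoding of search states are all illustrative choices, not details taken from the paper.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def owner(state: Hashable, num_procs: int) -> int:
    """Map a search node to the one processor that handles its duplicates."""
    # A stable hash is used so every processor computes the same owner.
    digest = hashlib.sha256(repr(state).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_procs


def route_and_prune(
    generated: Iterable[Tuple[int, Hashable, float]], num_procs: int
) -> Tuple[Dict[int, Dict[Hashable, float]], int]:
    """Simulate sending each generated node to its owner and pruning duplicates.

    generated holds (source_processor, state, path_cost) triples.
    Returns the per-owner table of best known costs and the number of
    duplicate nodes that were discarded.
    """
    tables: Dict[int, Dict[Hashable, float]] = defaultdict(dict)
    pruned = 0
    for _src, state, cost in generated:
        table = tables[owner(state, num_procs)]
        if state in table:
            # Same node reached from another processor: keep the cheaper copy.
            pruned += 1
            if cost < table[state]:
                table[state] = cost
        else:
            table[state] = cost
    return tables, pruned


# Two processors reach the same partial tour; a third finds a different one.
nodes = [(0, ("A", "B"), 5.0), (1, ("A", "B"), 4.0), (2, ("A", "C"), 7.0)]
tables, pruned = route_and_prune(nodes, num_procs=4)
```

Because the destination depends only on the hash of the state, both drawbacks named in the abstract are visible in the sketch: the hash alone decides how work is distributed, and any processor may need to send to any other.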
ISBN: 081864222X (print)
Independent checkpointing is a simple technique for providing fault tolerance in distributed systems. However, it can suffer from the domino effect, which causes the rollback of one process to potentially propagate to others. In this paper we present an adaptive checkpointing algorithm to practically eliminate rollback propagation for independent checkpointing. Our algorithm is based on proofs of the conditions necessary and sufficient for a checkpoint to belong to some consistent global checkpoint, previously an open question. We characterize these conditions with a generalization of Lamport's happened-before relation called a zigzag path. Our algorithm tracks zigzag paths on-line and checkpoints when certain paths are detected. Experiments on an iPSC/860 hypercube show that our algorithm reduces the average rollback required to recover from any fault to less than one checkpoint interval per process, and checkpoints only 4% more often than traditional periodic checkpointing algorithms. We thus eliminate rollback propagation without the runtime overhead of coordinated checkpoints or other schemes that attempt to reduce rollback propagation.