ISBN: 0818675829 (print)
Load balancing has been studied extensively by simulation, and most of these studies report positive results. With the increasing availability of distributed systems, a few experiments have been carried out on different systems. These experimental studies depend either on task initiation alone or on task initiation plus task migration. In this paper, we present the results of an experimental study of load balancing that uses a centralized policy to manage the load on a set of processors. The study was carried out on an Amoeba system consisting of a set of 386 machines linked by 10 Mbps Ethernet. On the one hand, the results indicate the necessity of a load balancing facility for a distributed system. On the other hand, the results question the impact of using process migration to increase system performance under the configuration used in our experiments.
ISBN: 0769520367 (print)
We present a new conservative event-synchronization protocol, time-based synchronization, for parallel discrete-event simulation of mobile ad hoc wireless networks. Simulators that use our protocol proceed at a scaled version of real time and send messages that correspond only to transmissions in the simulated network. We show that such simulators can maintain a constant execution time even as the sizes of the networks that they simulate grow. Moreover we show that these simulators, when executed on a custom parallel architecture, are capable of simulating many networks faster than real time.
ISBN: 0769522106 (print)
Consistent hashing can be used to assign objects to nodes in a distributed system. It has been used by several distributed systems, including Chord, Pastry, and Tornado, because of its efficient handling of node failure and repair. In this paper, we analyze how well consistent hashing does at evenly distributing objects among the nodes in the system. We also extend current consistent hashing algorithms to allow for dynamic load balancing while retaining the good properties of consistent hashing. Finally, we analyze our extensions using both probabilistic analysis and simulations. The algorithms derived appear to achieve much better load balancing.
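The basic scheme that this abstract builds on can be illustrated with a minimal sketch. It is not the paper's extended algorithm: it assumes one ring position per node, a SHA-256-derived position, and hypothetical names (`HashRing`, `ring_position`, `node-a`, and so on). It shows only the property the abstract cites, that removing a node reassigns just the objects that node held.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a point on the ring [0, 2**64). Hash choice is an assumption."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Basic consistent hashing: an object goes to the next node clockwise."""

    def __init__(self, nodes):
        # Sorted (position, node) pairs form the ring.
        self._points = sorted((ring_position(n), n) for n in nodes)
        self._positions = [p for p, _ in self._points]

    def lookup(self, obj_key: str) -> str:
        # First node strictly after the object's position, wrapping around.
        i = bisect.bisect_right(self._positions, ring_position(obj_key))
        return self._points[i % len(self._points)][1]

    def without(self, node: str) -> "HashRing":
        """Return a new ring modelling the failure of `node`."""
        return HashRing([n for _, n in self._points if n != node])


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
keys = [f"object-{i}" for i in range(200)]
before = {k: ring.lookup(k) for k in keys}
after_failure = {k: ring.without("node-b").lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after_failure[k]]
```

With a single position per node the per-node load can be quite uneven, which is the kind of imbalance the abstract sets out to analyze.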
ISBN: 0818608935 (print)
An examination is made of heuristic algorithms for processing distributed queries using generalized joins. As this optimization problem is NP-hard, a heuristic algorithm is used to formulate strategies to process queries. It has a special property in that its overhead can be controlled: the higher its overhead, the better the strategies it produces. Modeling on a testbed of queries demonstrates that there is a tradeoff between the strategy's execution and formulation delays. The modeling results also support the notion that simple greedy heuristic algorithms are sufficient, in that they are likely to lead to near-optimal strategies, and that increasing the overhead in forming strategies is only marginally beneficial. Both the strategy formulation and execution delays are examined in relation to the number of operations specified by the strategy and the total size of partial results.
ISBN: 0818677937 (print)
Sparse matrix problems require a communication paradigm different from those used in conventional distributed-memory multiprocessors. We present in this paper how fine-grain communication can help obtain high performance on the experimental distributed-memory multiprocessor EM-X, developed at ETL, which can handle fine-grain communication very efficiently. The sparse matrix kernel Conjugate Gradient (CG) is selected for the experiments. Among the steps in CG, we focus on sparse matrix-vector multiplication. Several communication methods are developed for performance comparison, including coarse-grain and fine-grain implementations. Fine-grain communication allows exact data access in an unstructured problem, which reduces the amount of communication. While CG presents bottlenecks in terms of a large number of fine-grain remote reads, the multithreaded principles of execution are designed to tolerate such latency. Experimental results indicate that the performance of the fine-grain read implementation is comparable to that of the coarse-grain implementation on 64 processors. The results demonstrate that fine-grain communication can be a viable and efficient approach to unstructured sparse matrix problems on large-scale distributed-memory multiprocessors.
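To make the access pattern concrete, the following is a minimal sequential sketch of sparse matrix-vector multiplication in compressed sparse row (CSR) form. The storage format, the function names, and the row partitioning are assumptions for illustration and are not taken from the paper or from EM-X. The indirect read `x[col_idx[k]]` is the access that would become a remote read when `x` is distributed, and `needed_columns` lists exactly the entries a processor owning a block of rows would have to fetch.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = []
    for row in range(len(row_ptr) - 1):
        total = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Irregular, data-dependent read of x: a remote read if x is distributed.
            total += values[k] * x[col_idx[k]]
        y.append(total)
    return y


def needed_columns(col_idx, row_ptr, row_start, row_end):
    """Entries of x touched by rows [row_start, row_end): the exact fine-grain fetch set."""
    return sorted(set(col_idx[row_ptr[row_start]:row_ptr[row_end]]))


# Example matrix (hypothetical data):
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

A coarse-grain scheme would typically ship whole blocks of `x` regardless of which entries are used, whereas a fine-grain scheme can request only the indices that `needed_columns` returns.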
ISBN: 0769508197 (print)
Database schema integration is significant not only in building multidatabase systems but also in data warehousing. Metadata, which define schemas, are normally involved in the surrounding issues. While many of these issues have been addressed in the past, unresolved issues remain. In this paper, we present an approach that not only uses metadata but also uses meta-data information to make schema integration more feasible. Our solution requires a meta object facility that serves not only as a repository but also as a more feasible means of managing metadata. We also advocate the use of such a facility as part of an object-oriented middleware environment that provides an open interface standard and several useful services in distributed object management.
Several variants of parallel multipole-based algorithms have been implemented to further research in fields such as computational chemistry and astrophysics. We present a distributed parallel implementation of a multipole-based algorithm that is portable to a wide variety of applications and parallel platforms. Performance data are presented for loosely coupled networks of workstations as well as for more tightly coupled distributed multiprocessors, demonstrating the portability of the application and its scalability to large numbers of processors.
ISBN: 9781450327572 (print)
A dominant cost for query evaluation in modern massively distributed systems is the number of communication rounds. For this reason, there is a growing interest in single-round multiway join algorithms where data is first reshuffled over many servers and then evaluated in a parallel but communication-free way. The reshuffling itself is specified as a distribution policy. We introduce a correctness condition, called parallel-correctness, for the evaluation of queries w.r.t. a distribution policy. We study the complexity of parallel-correctness for conjunctive queries as well as transferability of parallel-correctness between queries. We also investigate the complexity of transferability for certain families of distribution policies, including, for instance, the Hypercube distribution.
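As a rough illustration of the setting, the sketch below reshuffles the triangle query R(a,b), S(b,c), T(c,a) with a Hypercube-style policy and checks that the union of the local, communication-free evaluations equals the centralized answer, which is the parallel-correctness condition for this one instance. The query, the 2x2x2 server grid in `SHARES`, the modulo hashing, and all function names are assumptions chosen for the example, not details from the paper.

```python
from itertools import product

SHARES = (2, 2, 2)  # assumed server grid: one dimension per variable a, b, c
VAR_DIMS = {"R": (0, 1), "S": (1, 2), "T": (2, 0)}  # variables bound by each relation


def destinations(relation, tup):
    """Grid coordinates of all servers that receive this tuple."""
    fixed = {dim: value % SHARES[dim] for dim, value in zip(VAR_DIMS[relation], tup)}
    # Hash on bound variables; replicate along the unbound dimension.
    ranges = [[fixed[d]] if d in fixed else range(SHARES[d]) for d in range(3)]
    return list(product(*ranges))


def triangles(R, S, T):
    """Evaluate R(a,b), S(b,c), T(c,a) on one site."""
    T_set = set(T)
    return {(a, b, c) for (a, b) in R for (b2, c) in S if b2 == b and (c, a) in T_set}


def evaluate_distributed(R, S, T):
    """One reshuffling round, then local evaluation and union of the results."""
    servers = {coord: {"R": set(), "S": set(), "T": set()}
               for coord in product(*(range(s) for s in SHARES))}
    for name, rel in (("R", R), ("S", S), ("T", T)):
        for tup in rel:
            for coord in destinations(name, tup):
                servers[coord][name].add(tup)
    result = set()
    for local in servers.values():
        result |= triangles(local["R"], local["S"], local["T"])
    return result


R = {(1, 2), (2, 3), (4, 5)}
S = {(2, 3), (3, 1), (5, 6)}
T = {(3, 1), (1, 2), (6, 4)}
distributed = evaluate_distributed(R, S, T)
centralized = triangles(R, S, T)
```

The policy works for this query because every satisfying assignment (a, b, c) has all three of its tuples routed to the server at coordinates (a mod 2, b mod 2, c mod 2).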
ISBN: 0818677937 (print)
This paper describes optimization techniques for translating out-of-core programs written in a data parallel language to message passing node programs with explicit parallel I/O. We demonstrate that straightforward extension of in-core compilation techniques does not work well for out-of-core programs. We then describe how the compiler can optimize the code by (1) determining appropriate file layouts for out-of-core arrays, (2) permuting the loops in the nest(s) to allow efficient file access, and (3) partitioning the available node memory among references based on I/O cost estimation. Our experimental results indicate that these optimizations can reduce the amount of time spent in I/O by as much as an order of magnitude.
ISBN: 0818684038 (print)
Predicting the running time of a parallel program is useful for determining the optimal values for the parameters of the implementation and the optimal mapping of data on processors. However, deriving an explicit formula for the running time of a given parallel program is a difficult task. We present a new method for the analysis of parallel programs: simulating the execution of parallel programs by following their control flow and by determining, for each processor, the sequence of send and receive operations according to the LogGP model. We developed two algorithms to simulate the LogGP communication between processors, and we tested them on the blocked parallel version of the Gaussian Elimination algorithm on the Meiko CS-2 parallel machine. Our implementation showed that the LogGP simulation is able to detect the nonlinear behavior of the program running times, to indicate the differences in running times for different data layouts, and to find the local optimal value of the block size with acceptable precision.
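For orientation, the standard LogGP cost of a single point-to-point message can be written as a small function. This is only the basic building block of the model, not either of the two simulation algorithms the abstract mentions, and the parameter values below are made up for the example rather than measured on the Meiko CS-2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogGP:
    """LogGP parameters; units are arbitrary but must be consistent."""
    L: float  # network latency
    o: float  # per-message send/receive overhead on the processor
    g: float  # minimum gap between consecutive messages
    G: float  # gap per byte for long messages
    P: int    # number of processors


def message_time(model: LogGP, k_bytes: int) -> float:
    """End-to-end time for one k-byte message: o + (k - 1) * G + L + o."""
    if k_bytes < 1:
        raise ValueError("message must contain at least one byte")
    return model.o + (k_bytes - 1) * model.G + model.L + model.o


# Hypothetical parameter values, chosen only to make the arithmetic easy to follow.
model = LogGP(L=10.0, o=2.0, g=4.0, G=0.5, P=8)
short_message = message_time(model, 1)
long_message = message_time(model, 101)
```

A simulator in the spirit of the abstract would apply such costs along each processor's sequence of send and receive operations; how it orders and overlaps them is beyond this sketch.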