Previous proposals for the application of discrete event methods to parallelization of sequential software have been based on the optimistic execution strategy. We present a new method which avoids optimistic executio...
详细信息
Previous proposals for the application of discrete event methods to parallelization of sequential software have been based on the optimistic execution strategy. We present a new method which avoids optimistic execution. This is motivated by the observation that the control structure of a sequential program constitutes a temporal coordinate system which is exogenous to the program execution. The method employs a logical time mechanism and provides adaptive synchronisation for the distributed execution. Hence data dependent and/or conditional parallelism is released without the risk of coherency violation. The paper begins with a brief introduction to the parallel discrete event simulation paradigm. The efficient coarse grain mapping of conventional programs onto this paradigm is then discussed.
In this paper, a modified version of previously proposed quasi-barrier technique is developed. On distributed memory machines, relaxation with modified quasi-barriers can be used to perform basis computations that ari...
详细信息
In this paper, a modified version of previously proposed quasi-barrier technique is developed. On distributed memory machines, relaxation with modified quasi-barriers can be used to perform basis computations that arise in symbolic polynomial manipulation. In this type of synchronous computation, the set of tasks is distributed across the processors. Each nonzero result of a task reduction dynamically generates a set of new tasks. The distribution of these newly generated tasks can have a significant impact on the overall execution time of the parallel computation. In this paper, four task distribution strategies, named modified block, modified sorted block, modified cyclic, and modified sorted cyclic are developed and their performances are comparatively evaluated. For the experiments performed on an 18-node IBM SP2, the modified cyclic distribution provides the best performance overall.
In this paper, we propose and analyze the new interconnection network for parallel computer, called graycube. The graycube has the same number of nodes and edges as hypercube, but it's diameter is about one half o...
详细信息
ISBN:
(纸本)0818682272
In this paper, we propose and analyze the new interconnection network for parallel computer, called graycube. The graycube has the same number of nodes and edges as hypercube, but it's diameter is about one half of the equivalent hypercube. It has simple recursive structure, routing and broadcasting algorithms. Since hypercube can be embedded into graycube with dialation 2, algorithms developed based on hypercube are easily simulated in graycube. The basic properties, routing and broadcasting algorithms, and hypercube embedding are presented.
CrownFS is a file system for continuous media server in the CROWN (Clustering Resources On Workstations' Network) system. Because of the high I/O bandwidth rate and strict real-time constraints, the design issues ...
详细信息
ISBN:
(纸本)0818682272
CrownFS is a file system for continuous media server in the CROWN (Clustering Resources On Workstations' Network) system. Because of the high I/O bandwidth rate and strict real-time constraints, the design issues of CrownFS are that of efficiently balancing user load against limited disk: bandwidth, network and I/O channel bandwidth. CrownFS accomplishes this balancing by striping continuous data file across several workstations and computing at each workstation in the CROWN system. This paper describes the CROWN system architecture and the CrownFS design. CrownFS runs on a cluster of workstations consisted of Kawaiis, AlphaStations and PCs.
In this paper, we explore in detail some properties of 2*-trees. A 2*-tree is a connected graph with N nodes, diameter log/sub 2/N, and degree of nodes 10. We propose routing algorithms for the unicasting, the broadca...
详细信息
In this paper, we explore in detail some properties of 2*-trees. A 2*-tree is a connected graph with N nodes, diameter log/sub 2/N, and degree of nodes 10. We propose routing algorithms for the unicasting, the broadcasting and the gossiping on a 2*-tree network. The routing algorithms we present have their time and communication complexities lower than the corresponding complexities for the routing on H/sub n/, i.e., the hypercube of dimension n=log/sub 2/N. Consequently, we think that a 2*-tree is an interesting topology to be considered for a network of processors.
We put forward a distributed model for accessing heterogeneous database systems. Database operations requested by the user are processed in a distributed manner that takes advantage of the inherent parallelism of dist...
详细信息
We put forward a distributed model for accessing heterogeneous database systems. Database operations requested by the user are processed in a distributed manner that takes advantage of the inherent parallelism of distributed systems, minimises network traffic and uses almost any general purpose computer on the network. Processing is not confined to DBMS sites but is provided as a distributed service. The design is modular and provides the mechanisms that may be arranged in a variety of ways providing a range of integration paradigms from loosely coupled integrations to more tightly coupled integrations.
This paper presents a new dynamic load balancing algorithm for hypercube multicomputers with faulty nodes. The emphasis in our method is on obtaining global load information and performing task migration using "s...
详细信息
ISBN:
(纸本)0818682272
This paper presents a new dynamic load balancing algorithm for hypercube multicomputers with faulty nodes. The emphasis in our method is on obtaining global load information and performing task migration using "short paths" in a synchronous manner so that a minimal amount of communication overhead is required. To accomplish this, we present an algorithm for constructing a new logical topology from a hypercube topology with faulty nodes. This new topology is used to obtain the global load information and to perform task migration. Simulation results are used to evaluate the performance of our dynamic load balancing method. The proposed strategy shows good performance in the case of a small number of faulty nodes when compared with previous methods.
We present a simple and efficient algorithm for the nearest smallers problem (NSP) on a distributed shared memory (DSM) system with applications to problems from diverse areas. We adopt the block distributed memory (B...
详细信息
We present a simple and efficient algorithm for the nearest smallers problem (NSP) on a distributed shared memory (DSM) system with applications to problems from diverse areas. We adopt the block distributed memory (BDM) model of computation as described in [2]. To the best of our knowledge this is the first known algorithm for the NSP on DSM systems. Since the NSP is fundamental in many problems, a solution for it on DSM systems implies DSM-based solutions for a variety of problems in diverse areas as discussed in this paper. parallel algorithms known so far for the NSP are based on shared memory systems and are therefore less scalable than our algorithm.
The paper presents solutions for an element of the cluster that will be primarily used as a scalable storage product for a collection of mainframes. Existing storage solutions either employ server attached disks: wher...
详细信息
The paper presents solutions for an element of the cluster that will be primarily used as a scalable storage product for a collection of mainframes. Existing storage solutions either employ server attached disks: where the problem is the number of I/O slots, or are specially designed products where the problem is the lack of standards. The essence of both proposed solutions is the introduction of the file usage locality concept, together with methods and techniques to exploit it. The first solution is based on the standard SMP architecture, while the second one employs the DSM architecture as the basic building block of a cluster product.
In group communications, larger computation and communication overhead are considered to causally order all the messages transmitted in the network. Transactions in clients manipulate objects in servers by sending rea...
详细信息
In group communications, larger computation and communication overhead are considered to causally order all the messages transmitted in the network. Transactions in clients manipulate objects in servers by sending read and write requests to the servers. In this paper, we define significant messages by using the relation among the transactions. We newly propose an object vector to causally order only the significant messages. The scheme of the object vector is invariant in the change of the group membership. We also show a TBCO (transaction-based causally ordered) protocol which adopts the object vector, by which the number of messages to be causally ordered are reduced.
暂无评论