ISBN: 9780897913195 (print)
The authors have simulated several numerical and nonnumerical algorithms on five distributed-memory parallel processors (DMPPs). All five DMPPs have the same topology (a torus) and the same number of nodes. The architectures differ only in the speed of communication between neighboring nodes, with the computation unit unchanged. The authors quantify the effect that interprocessor communication speed and synchronization overhead have on the performance of the DMPPs. After introducing their rationale and reviewing related work, the authors present and discuss the results of the simulations.
ISBN: 0818608935 (print)
Skew in the distribution of values taken by an attribute is identified as a major factor that can affect the performance of parallel architectures for relational joins. The effect of skew on the performance of two parallel architectures is evaluated using analytic models. In one architecture, called database machine (DBMC), both data and processing power are distributed; in the other, called single processor parallel input/output (SPPI), data is distributed but the processing power is concentrated in one processor. The two architectures are compared in terms of the ratio of MIPS (millions of instructions per second) used by DBMC and SPPI to deliver the same throughput and response time. In addition, the horizontal growth potential of DBMC is evaluated in terms of the maximum speedup achievable by DBMC relative to SPPI response time. Both the MIPS ratio and the speedup are found to be very sensitive to the amount of skew. These results suggest that careful thought should be given to parallelizing database applications and to the design of algorithms and query optimizers for parallel architectures.
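As a rough illustration of why speedup is sensitive to skew, the sketch below is not the paper's analytic model. It assumes join tuples are assigned to nodes by a simple modulo partition on the join attribute and that response time is set by the most heavily loaded node. The function names `partition_loads` and `max_speedup` are hypothetical.

```python
from collections import Counter


def partition_loads(values, num_nodes):
    """Count how many tuples land on each node under modulo partitioning.

    Assumption: integer join-attribute values, node = value % num_nodes.
    """
    loads = [0] * num_nodes
    for value, count in Counter(values).items():
        loads[value % num_nodes] += count
    return loads


def max_speedup(loads):
    """Upper bound on speedup when the busiest node determines response time."""
    return sum(loads) / max(loads)


# Uniform attribute: 1000 distinct values, one tuple each.
uniform = list(range(1000))
# Skewed attribute: half of all tuples share the single value 0.
skewed = [0] * 500 + list(range(1, 501))

uniform_speedup = max_speedup(partition_loads(uniform, 10))
skewed_speedup = max_speedup(partition_loads(skewed, 10))
print(uniform_speedup, skewed_speedup)
```

Under these assumptions the uniform input can use all ten nodes evenly, while the skewed input is limited by the one node that receives the heavy value, no matter how many nodes are added.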
ISBN: 081860719X (print)
Rule-based systems appear to be capable of exploiting large amounts of parallelism, because it is possible to match each rule to the data memory in parallel. It is pointed out that in practice the speedup from parallelism is quite limited, less than 10-fold. The reasons for the small speedup are: (1) the small number of rules relevant to each change to data memory; (2) the large variation in the processing required by the relevant rules; and (3) the small number of changes made to data memory between synchronization steps. To obtain this limited factor of 10-fold speedup, it is necessary to exploit parallelism at a very fine granularity. It is suggested that a suitable architecture to exploit such fine-grain parallelism is a bus-based shared-memory multiprocessor with 32-64 processors. Using such a multiprocessor (with individual processors working at 2 MIPS), it is possible to obtain execution speeds of about 3800 rule-firings/s. This speed is significantly higher than that obtained by other proposed parallel implementations of rule-based systems.
ISBN: 0818606347 (print)
An approach is proposed for modeling off-the-shelf hardware and parallel algorithms, along with a design methodology that uses the information provided by these models to design a class of macro-pipelined special-purpose architectures. Nine parameters are presented that model the characteristics of parallel/distributed algorithms and the environment in which they must execute. In addition, a set of tuples that model the characteristics of computer architectures is presented. By combining the tuples with the parameters, the execution time of the algorithm modeled by the parameters on the hardware modeled by the tuples can be approximated. The combination of these models could be used as a basis for computer-aided tools for the design of macro-pipelined parallel/distributed processors.
In this paper we investigate the question of what is a good way to interconnect a large number of processors. Our main result is the construction of a universal parallel machine that can simulate every reasonable para...
We present an algorithm that recognizes the class of General Series Parallel digraphs and runs in time proportional to the size of its input. To perform this recognition task it is necessary to compute the transitive ...