Space-efficient Information Dispersal Algorithm (IDA) [11] is applied to parallel communication in the hypercube. Let N denote the size of the network. Our communication scheme runs in 2·log N + 1 time using cons...
详细信息
ISBN:
(纸本)0897913701
Space-efficient Information Dispersal Algorithm (IDA) [11] is applied to parallel communication in the hypercube. Let N denote the size of the network. Our communication scheme runs in 2·log N + 1 time using constant size buffers. Its probability of successful routing is at least 1 - N-2.419·log N+1.5, proving Rabin's conjecture. The same scheme also tolerates O(N) random link failures with high probability. The scheme runs within the said time bound without long delay. On-line and efficient wire testing and replacement on the hypercube can be realized if our fault-tolerant routing scheme is used. Let α denote the total number of links in the hypercube. It is shown that ≈ α/352 wires can be disabled simultaneously without disrupting the ongoing computation or degrading the routing performance much.
Multithreading has been proposed as an architectural strategy for tolerating latency in multiprocessors and, through limited empirical studies, shown to offer promise. This paper develops an analytical model of multit...
详细信息
ISBN:
(纸本)0897913701
Multithreading has been proposed as an architectural strategy for tolerating latency in multiprocessors and, through limited empirical studies, shown to offer promise. This paper develops an analytical model of multithreaded processor behavior based on a small set of architectural and program parameters. The model gives rise to a large Markov chain, which is solved to obtain a formula for processor efficiency in terms of the number of threads per processor, the remote reference rate, the latency, and the cost of switching between threads. It is shown that a multithreaded processor exhibits three operating regimes: linear (efficiency is proportional to the number of threads), transition, and saturation (efficiency depends only on the remote reference rate and switch cost). Formulae for regime boundaries are derived. The model is embellished to reflect cache degradation due to multithreading, using an analytical model of cache behavior, demonstrating that returns diminish as the number threads becomes large. Predictions from the embellished model correlate well with published empirical measurements. Prescriptive use of the model under various scenarios indicates that multithreading is effective, but the number of useful threads per processor is fairly small.
Let P be a simple rectilinear convex polygon of size O(n) inside which lie n pairwise disjoint rectangular rectilinear obstacles. We provide parallel techniques for computing rectilinear shortest paths that avoid the ...
详细信息
ISBN:
(纸本)0897913701
Let P be a simple rectilinear convex polygon of size O(n) inside which lie n pairwise disjoint rectangular rectilinear obstacles. We provide parallel techniques for computing rectilinear shortest paths that avoid the set of obstacles in P. Specifically, we compute descriptions of shortest paths in O(log2 n) time, with O(n2/log2n) processors in the CREWPRAM model if source and destination are on the boundary of P, with O(n2/log n) processors if the source is an obstacle vertex and the destination a vertex of P, and with O(n2) processors if both source and destination are obstacle vertices. The descriptions we compute enable one processor to obtain the path length of any query pair of vertices in constant time, or O(n/log n) processors to retrieve the shortest path itself in logarithmic time. If the two query points are arbitrary rather than vertices, then one processor takes O(log n) time (instead of constant time) for finding the path length. A number of other related shortest paths problems are solved. The techniques we use involve a fast computation of separator staircases and, most importantly, a scheme for partitioning the obstacles' boundaries into families in a way that ensures that the resulting path length matrices have a monotonicity property that is apparently absent before applying our partitioning scheme.
The proceedings contain 45 papers. The topics discussed include: fast parallelalgorithms for the unit cost editing distance between trees;towards understanding exclusive read;on the parallel complexity of integer pro...
ISBN:
(纸本)089791323X
The proceedings contain 45 papers. The topics discussed include: fast parallelalgorithms for the unit cost editing distance between trees;towards understanding exclusive read;on the parallel complexity of integer programming;parallel RAMs with bounded memory wordsize;load balancing, selection and sorting on the hypercube;the power of parallel pointer manipulation;the communication complexity of several problems in matrix compntation;embedding of d-dimensional grids into optimal hypercubes;deterministic P-RAM simulation with constant redundancy;processor networks and interconnection networks without long wires;a lower bound on the size of shellsort sorting networks;cost-bandwidth tradeoffs for communication networks;a more practical PRAM model;the APRAM: incorporating asynchrony into the PRAM model;fault tolerance in hypercube-derivative networks;square meshes are not always optimal;parallel graph contraction;and the virtual time machine.
暂无评论