For an n-gon P, we say P is weakly visible from segment s if any point on P is visible from at least one point of the segment. In this paper, we present an optimal preprocessing algorithm which runs in O(log n) time u...
The task or precedence graph formalism is a practical tool for studying algorithm parallelization. Redundancy in such task graphs gives rise to numerous avoidable inter-task dependencies, which invariably complicate the process of parallelization. In this paper we present an O(1) time algorithm for the elimination of redundancy in such graphs on Processor Arrays with Reconfigurable Bus Systems using O(n^4) processors. The previous parallel algorithm available in the literature for redundancy elimination in task graphs takes O(n^2) time using O(n) processors.
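To make "redundancy elimination" concrete, here is a minimal sequential sketch in Python. It assumes the task graph is a directed acyclic graph on vertices 0..n-1 and treats an edge (u, v) as redundant when some longer path from u to v already exists. It is not the constant-time reconfigurable-bus algorithm of the abstract; the function names are hypothetical and chosen only for illustration.

```python
def transitive_closure(n, edges):
    """Boolean reachability matrix computed with Warshall's algorithm."""
    reach = [[False] * n for _ in range(n)]
    for u, v in edges:
        reach[u][v] = True
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def remove_redundant_edges(n, edges):
    """Keep only the edges of a DAG that are not implied by a longer path."""
    reach = transitive_closure(n, edges)
    kept = []
    for u, v in edges:
        # (u, v) is redundant if some intermediate task w lies between u and v.
        redundant = any(
            reach[u][w] and reach[w][v] for w in range(n) if w not in (u, v)
        )
        if not redundant:
            kept.append((u, v))
    return kept


# Hypothetical task graph: 0 -> 1 -> 2 -> 3 plus two shortcut dependencies.
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (0, 3)]
reduced = remove_redundant_edges(4, example_edges)
```

In this example the shortcut edges (0, 2) and (0, 3) are dropped because the chain 0, 1, 2, 3 already enforces the same ordering.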
This short paper presents a novel pipelining and processor allocation strategy for monoid computations on an unshuffle-exchange network. Under this strategy, processor utilization is near 1 and communication is collision-free. With a constant number of connections per processor and only a single output node on the network, the method given here can compete with the method of Barnard and Skillicorn, which is based on a hypercube network with multiple output nodes.
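A monoid computation combines a sequence of values with an associative operation that has an identity element. The Python sketch below shows only the generic balanced-tree reduction that such computations rely on, with each round consisting of independent pairwise combinations. It does not model the unshuffle-exchange network or the pipelining strategy of the paper, and `tree_reduce` is a hypothetical name.

```python
def tree_reduce(values, op, identity):
    """Combine values with an associative op; return (result, rounds used)."""
    level = list(values)
    if not level:
        return identity, 0
    rounds = 0
    while len(level) > 1:
        # All pairs in a round are independent, so a parallel machine
        # could evaluate the whole round in a single step.
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd element carries over unchanged
        level = nxt
        rounds += 1
    return level[0], rounds


# String concatenation is associative but not commutative,
# so it also checks that operand order is preserved.
text, text_rounds = tree_reduce(["a", "b", "c", "d", "e"], lambda x, y: x + y, "")
total, total_rounds = tree_reduce(range(1, 9), lambda x, y: x + y, 0)
```

Because only associativity is assumed, the number of rounds grows logarithmically with the input length, which is the property that parallel networks exploit.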
Many efficient systolic algorithms for block computation of digital filters have been *** highly concurrent structures can be implemented from these block systolic computation *** the performance of these highly concurrent structures is described in terms of the total computation time and the total number of processors needed for each algorithm. A comparison with the sequential algorithms is summarized in tables of speedup and efficiency for each block algorithm.
In this paper, we propose a parallel algorithm for the traffic control problem (an NP-complete problem) on crossbar switch networks. The problem is to find a set of conflict-free paths such that the maximum number of message packets can be transmitted over the network. The problem can be represented by an energy function. By applying our parallel algorithm, the state of the energy function is iteratively updated toward a stable state. When the energy function reaches a stable state, that state represents a solution of the problem. The empirical results show that the throughput of the proposed algorithm is much better than that of the linear algorithm. We show that the time complexity of the parallel algorithm is O(n) using n^2 processors. Furthermore, since the traffic control problem can be reduced to the traveling salesman problem, the proposed algorithm can be applied to some other NP-complete problems.
A parallel algorithm for computing the transient response of structures is presented. The computation is parallelized on the basis of a division of the structure into substructures, with each processor computing the response of a substructure independently. Independently computed substructure responses are reconciled directly, rather than iteratively, to obtain the solution of the global problem. The only data required for correcting the independently computed response of a substructure is the interface motion computed independently for other substructures. Reconciliation of substructure responses is not required after every time step; instead, it can be postponed until after the responses have been computed independently for multiple time steps. A numerical example is presented that demonstrates the method and its accuracy.
This paper reports: 1) parallelization of the two best known sequential algorithms (Dotson & Gobein, and Page & Perry PP-F2TDN) for computing the terminal-pair reliability in a network; 2) Reduce&Partition (R&P), a new sequential algorithm which combines the most efficient features of these two algorithms. On published benchmark networks, R&P runs almost twice as fast as the previously known fastest algorithm. A parallel version of R&P is also presented. The execution times of all three parallel algorithms with various numbers of processors for different networks on the BBN Butterfly parallel computer are provided. R&P is both fast and parallelizable. The recursive algorithms require O(#vertices^2 × #edges) memory, as the recursion depth is limited to #edges and at each recursive node O(#vertices^2) memory is used to represent the network. Thus, the memory requirement of R&P is approximately the same as that of PP-F2TDN and much less than that of the non-recursive Dotson & Gobein algorithm. All three algorithms compute exact numerical reliability, but they can easily be modified to produce symbolic reliability expressions. The parallel algorithms were implemented on a shared-memory parallel computer. The R&P approach should be explored to solve other network reliability problems, such as K-terminal reliability. In R&P, a greedy approach was used in selecting shortest paths in order to locally minimize the number of subproblems. This selection did not consider the effect of reductions on the subproblems to be generated.
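For readers unfamiliar with the quantity being computed, the Python sketch below evaluates exact terminal-pair reliability on a small undirected network by conditioning on one edge at a time (the textbook factoring approach). It is not Dotson & Gobein, PP-F2TDN, or R&P, and it takes exponential time in the worst case. It assumes perfectly reliable vertices and independent edge failures, and all function names are hypothetical.

```python
def connected(edges, s, t):
    """True if s and t are joined by the given undirected edges."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == u and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return False


def terminal_pair_reliability(edges, s, t):
    """edges: list of (u, v, p), where p is the probability the edge works."""
    def recurse(undecided, working):
        if connected(working, s, t):
            return 1.0  # s and t are already joined by edges known to work
        if not connected(working + [(u, v) for u, v, _ in undecided], s, t):
            return 0.0  # no path even if every undecided edge works
        (u, v, p), rest = undecided[0], undecided[1:]
        # Condition on the first undecided edge: it works, or it fails.
        return p * recurse(rest, working + [(u, v)]) + (1 - p) * recurse(rest, working)

    return recurse(list(edges), [])


# Hypothetical five-edge bridge network between terminals 0 and 3.
bridge = [(0, 1, 0.5), (0, 2, 0.5), (1, 2, 0.5), (1, 3, 0.5), (2, 3, 0.5)]
bridge_reliability = terminal_pair_reliability(bridge, 0, 3)
```

The two early exits prune branches whose outcome is already decided, which is the same kind of saving that more refined algorithms pursue far more aggressively.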
We present parallel algorithms for some fundamental problems in computational geometry which have a running time of O(log n) using n processors, with very high probability (approaching 1 as n approaches infinity). These include planar-point location, triangulation, and trapezoidal decomposition. We also present optimal algorithms for three-dimensional maxima and two-set dominance counting by an application of integer sorting. Most of these algorithms run on a CREW PRAM model and have an optimal processor-time product, which improves on the previously best-known algorithms of Atallah and Goodrich [5] for these problems. The crux of these algorithms is a useful data structure which emulates the plane-sweeping paradigm used for sequential algorithms. We extend some of the techniques used by Reischuk [26] and Reif and Valiant [25] for the flashsort algorithm to perform divide and conquer in the plane very efficiently, leading to the improved performance of our approach.
A parallel cost-optimal algorithm to compute the supremum of max-min powers of any map (graph) is obtained using the EREW SM SIMD computer as the model of computation. The run-time of the algorithm is O(n) using n processors where n is the number of elements in the underlying set (the vertices of the graph).
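The abstract does not define its terms, so the following Python sketch rests on an assumed reading: the graph is given as an n × n matrix R of weights, the k-th max-min power is obtained by repeated max-min composition, and the supremum is the elementwise maximum over the powers R, R^2, ..., R^n. This is a plain sequential illustration, not the EREW algorithm of the paper, and the function names are hypothetical.

```python
def max_min_compose(a, b):
    """Max-min composition: c[i][j] = max over k of min(a[i][k], b[k][j])."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def sup_of_max_min_powers(r):
    """Elementwise maximum of the max-min powers r, r^2, ..., r^n."""
    n = len(r)
    power = [row[:] for row in r]
    sup = [row[:] for row in r]
    for _ in range(n - 1):
        power = max_min_compose(power, r)
        sup = [[max(sup[i][j], power[i][j]) for j in range(n)] for i in range(n)]
    return sup


# Hypothetical weighted 3-cycle 0 -> 1 -> 2 -> 0 with weights 0.8, 0.5, 0.3.
r = [
    [0.0, 0.8, 0.0],
    [0.0, 0.0, 0.5],
    [0.3, 0.0, 0.0],
]
closure = sup_of_max_min_powers(r)
```

Under this reading, entry (i, j) of the result is the best bottleneck weight over all walks from i to j, and walks longer than n edges cannot improve it because removing a cycle never lowers the bottleneck.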
In [2], a randomized algorithm of Petford and Welsh [1] was parallelized, and optimal speedup as well as linear time complexity was reported. Later, an error was found in the simulation program when the experiments were repeated independently. Consequently, Table 1 of [2] is incorrect. Here we explain how the parallelization has to be modified to get essentially (up to a constant factor) the same results: linear average time complexity of the parallel variant of the algorithm and optimal speedup. In the first section we give the corrected algorithm. We then give an example that illustrates the kinds of problems that can arise in a straightforward implementation. We conclude with experimental results for the corrected algorithm.