For an arbitrary nxn matrix A and an nx1 column vector b, we present a systolic algorithm to solve the dense linear equations Ax=b. An important consideration is that the pivot row can be changed during the execution ...
详细信息
For an arbitrary nxn matrix A and an nx1 column vector b, we present a systolic algorithm to solve the dense linear equations Ax=b. An important consideration is that the pivot row can be changed during the execution of our systolic algorithm. The computational model consists of n linear systolic arrays. For 1 less than or equal to i less than or equal to n, the i(th) linear array:is responsible to eliminate the i(th) unknown variable x(i) of x. This algorithm requires 4n time steps to solve the linear system. The elapsed time unit within a time step is independent of the problem size n. Since the structure of a PE is simple and the same type PE executes the identical instructions, it is very suitable for VLSI implementation. The design process and correctness proof are considered in detail. Moreover,this algorithm can detect whether A is singular or not.
With the advent of VLSI it has become possible to map parallel algorithms for compute-bound problems directly on silicon. systolic architecture is very good candidate for VLSI implementation because of its regular and...
详细信息
With the advent of VLSI it has become possible to map parallel algorithms for compute-bound problems directly on silicon. systolic architecture is very good candidate for VLSI implementation because of its regular and simple design, and regular communication pattern. In this paper, a systolic algorithm and corresponding systolic architecture, a linear systolic array, for the scanline-based hidden surface removal problem in three-dimensional computer graphics have been proposed. The algorithm is based on the concept of sample spans or intervals. The worst case time taken by the algorithm is O(n), n being the number of segments in a scanline. The time taken by the algorithm for a given scene depends on the scene itself, and on an average considerable improvement over the worst case behaviour is expected. A pipeline scheme for handling the I/O process has also been proposed which is suitable for VLSI implementation of the algorithm.
This paper describes a systolic algorithm based on tensor products of basis functions for multivariable approximation. Some implementation issues are also discussed.
This paper describes a systolic algorithm based on tensor products of basis functions for multivariable approximation. Some implementation issues are also discussed.
This paper describes a systolic algorithm for rational interpolation based on Thiele's reciprocal differences. This algorithm constructs the continued fraction/Pade approximant of a stream of input data using a li...
详细信息
This paper describes a systolic algorithm for rational interpolation based on Thiele's reciprocal differences. This algorithm constructs the continued fraction/Pade approximant of a stream of input data using a linear array of processors. The period of this algorithm is O(n + 1) (where n + 1 is the number of distinct points at which the function values are available) to produce an M/M Pade approximant (M = n + 1/2, n odd;M = n/2, n even) using n + 1 processors. For illustrative purpose the Connection Machine implementation of this systolic algorithm in CM Fortran is presented with an example.
For arbitraryn×nmatrixAandn×mmatrixB, a systolic algorithm to solve the linear systemsAX=Bis presented. The cmputational model used consists ofn(n+1) PEs. The number of PEs used is independent ofm. This algo...
详细信息
For arbitraryn×nmatrixAandn×mmatrixB, a systolic algorithm to solve the linear systemsAX=Bis presented. The cmputational model used consists ofn(n+1) PEs. The number of PEs used is independent ofm. This algorithm requires 4n+m–2 time steps to solve the linear systems. Since the structure of PE is simple and the PE with same type executes the identical program, it is very suitable for VLSI implementation. Moreover, ifmis a singular matrix, it can be detected during the execution of the systolic algorithm.
A new fully-pipelined systolic algorithm for finding all the bridges of an undirected connected graph is proposed. Given a graph of n vertices and m edges, the proposed algorithm uses (2n - 2) systolic cells and runs ...
详细信息
A new fully-pipelined systolic algorithm for finding all the bridges of an undirected connected graph is proposed. Given a graph of n vertices and m edges, the proposed algorithm uses (2n - 2) systolic cells and runs in (m + 3n - 3) systolic cycles. This improves a previous result. The use of fully-pipelined cells and the uniformity of the operations in each cell make the proposed algorithm distinctive.
作者:
TSAY, JCLEE, WP[a]College of Engineering
National Chiao Tung University Institute of Computer Science and Information Engineering Hsinchu Taiwan 30050 Republic of China
Subset generation, generating all 2'' - 1 subsets of an n-element set, is frequently necessary in combinatorial algorithms. In this paper, we shall utilize Moldovan's space-time mapping methodology to desi...
详细信息
Subset generation, generating all 2'' - 1 subsets of an n-element set, is frequently necessary in combinatorial algorithms. In this paper, we shall utilize Moldovan's space-time mapping methodology to design a systolic algorithm for the subset generation problem. The algorithm is a cost-optimal design and can generate all subsets in lexicographic order. Since it can be run on a simple linear systolic array, it is amenable to VLSI implementation.
A linear systolic algorithm is proposed for the connected component problem. Given an undirected graph withn vertices,m edges, andp connected components, the proposed algorithm uses (n ?p + 1) cells pipelined together...
详细信息
A linear systolic algorithm is proposed for the connected component problem. Given an undirected graph withn vertices,m edges, andp connected components, the proposed algorithm uses (n ?p + 1) cells pipelined together to figure out, in (m + 3(n ?p) + 2) systolic cycles, which component each vertex belongs to.
A new systolic algorithm for finding bridges in an n -node connected graph is proposed in this paper. The algorithm is executable on a n × n array of processing elements. It is shown that our algorithm is an impr...
详细信息
A new systolic algorithm for finding bridges in an n -node connected graph is proposed in this paper. The algorithm is executable on a n × n array of processing elements. It is shown that our algorithm is an improvement over the previously known algorithm for an n -node connected graph on a n × n array of processing elements, as proposed by Attalah and Kosaraju. The improvement both in area and time complexity is significant. Moreover, the control structure required for implementing the proposed algorithm on a systolic array is simpler.
The problem of longest common subsequence is defined as finding the longest common subsequence of two input sequences. It can be employed in many fields such as speech and signal processing, data compression, syntacti...
详细信息
ISBN:
(纸本)9781424420957
The problem of longest common subsequence is defined as finding the longest common subsequence of two input sequences. It can be employed in many fields such as speech and signal processing, data compression, syntactic pattern recognition, string processing, and genetic engineering. Usually, lengths of both input sequences are very long, resulting in long processing time. Many previous algorithms have been proposed to reduce the processing time. Previous systolic algorithms can achieve linear time complexity with linear number of processing elements. However, each processing element needs linear memory-space to store intermediate results and then the system needs too large memory-space in total, resulting in being not suitable for being implemented on VLSI. Hence, to design an efficient and hardware-implementable algorithm is an important issue. This paper proposes a hardware-implementable and efficient systolic algorithm for the longest common subsequence problem. At first, we investigate the property of overlapped-contours. By employing "divide and conquer" technique, we can rind a longest common subsequence from the overlapped-contours. It needs a little more time complexity with only linear memory-space in total. Furthermore, the number of physical processing elements can be fixed by employing the concept of virtual processing elements. Therefore, our proposed systolic algorithm is not only hardware-implementable on VLSI but also efficient.
暂无评论