Associative processing based on content-addressable memories has been argued to be the natural solution for non-numerical information processing applications. Unfortunately, the implementation requirements of these ar...
详细信息
Associative processing based on content-addressable memories has been argued to be the natural solution for non-numerical information processing applications. Unfortunately, the implementation requirements of these architectures using conventional electronic technology have been very cost prohibitive, and therefore associative processors have not been realized. Optics has the capability over electronics for directly supporting associative processing by providing economic and efficient interconnects, massive parallelism, and high-speed processing. This paper presents the principles of designing an optical content-addressable parallel processor called OCAPP, for the efficient support of parallel symbolic computing. The architecture is designed to fully exploit optics advantages in interconnects and high-speed operations, and is potentially very suitable for applications where the number of data sets to be operated on is high. Several search and retrieval algorithms are mapped onto OCAPP to illustrate its ability to support parallel symbolic computing.< >
This paper discusses detection of unstable predicates in a distributed program. Some applications of this are in program debugging and testing. The authors provide a predicate logic in the form of a grammar giving the...
详细信息
This paper discusses detection of unstable predicates in a distributed program. Some applications of this are in program debugging and testing. The authors provide a predicate logic in the form of a grammar giving the rewrite rules for constructing predicates about a distributed program. This predicate logic is general enough to describe many conditions programmers are interested in. These predicates may depend on the global state of the system and may be unstable in nature. However, the detection algorithms given guarantee to detect such predicates should one ever become true. The main results of this work are a logic for expressing unstable distributed predicates and detection methods for certain unstable distributed predicates.< >
The problem of generating test pattern sequences for digital circuits is a crucial one in the area of electronic CAD. While a significant effort has been made to develop new and more powerful algorithms to solve it, t...
详细信息
The problem of generating test pattern sequences for digital circuits is a crucial one in the area of electronic CAD. While a significant effort has been made to develop new and more powerful algorithms to solve it, the required CPU times are still unacceptable in many cases. The authors propose a different approach based on the use of a general-purpose parallel architecture. The attention is devoted to combinational circuits described at the gate level, and faults are modeled as permanent single stuck-ats. parallelization strategies are discussed, together with an implementation on a transputer-based machine. The resulting system significantly speeds-up the test generation process: its performance is discussed, also reporting the experimental results obtained on the standard set of bench-mark combinational circuits.< >
The Viterbi-Algorithm (VA) is a common application of dynamic programming. The algorithm contains a nonlinear feedback loop (ACS-feedback, ACS: add-compare-select) which is the bottleneck in high data rate implementat...
详细信息
The Viterbi-Algorithm (VA) is a common application of dynamic programming. The algorithm contains a nonlinear feedback loop (ACS-feedback, ACS: add-compare-select) which is the bottleneck in high data rate implementations. In this paper we show that, asymptotically, the ACS-feedback no longer has to be processed recursively, i.e., there is no feedback. With only negligible performance loss, this fact can be exploited technically to design efficient and purely feedforward architectures for Viterbi decoding that have a modular extendable structure. By designing one cascadable module, any speedup can be achieved simply by adding modules to the implementation. It is shown that optimization criteria, as minimum latency or maximum hardware efficiency, are met by very different architectures.
This paper deals with the parallel execution of algorithms with global and/or irregular data dependencies on a regularly and locally connected processor array. The associated communication problems are solved by the u...
详细信息
This paper deals with the parallel execution of algorithms with global and/or irregular data dependencies on a regularly and locally connected processor array. The associated communication problems are solved by the use of a two-dimensional sorting algorithm. The proposed architecture, which is based on a two-dimensional sorting network, offers a high degree of flexibility and allows an efficient mapping of many irregularly structured algorithms. In this architecture a one-dimensional processor array performs all required control and arithmetic operations, whereas the sorter solves complex data transfer problems. The storage capability of the sorting network is also used as memory for data elements. The algorithms for sparse matrix computations, fast Fourier transformation and for the convex hull problem, which are mapped onto this architecture, as well as the simulation of a shared-memory computer show that the utilization of the most complex components, the processors, is O(1).
The authors give an algorithm for the real-time computation of the running order statistic (ROS). Order statistic filters (OSFs) or ranked order filters (ROFs) are a class of nonlinear digital filters which have found...
详细信息
The authors give an algorithm for the real-time computation of the running order statistic (ROS). Order statistic filters (OSFs) or ranked order filters (ROFs) are a class of nonlinear digital filters which have found applications in image processing and in smoothing of speech data. Real-time implementation of an OSF requires the computation of the order statistic (ranked order) of the samples in a window (usually containing an odd number of samples) which gets periodically updated with the arrival of a new sample(s). A modular and highly parallel architecture characterized by programmable window size and required rank order is proposed for the implementation of the algorithm in VLSI form.< >
ELSA project concerns massively parallelarchitectures on silicon dedicated especially to low-level image processing. Real-time low-level image processing demands a huge amount of computing power. Fortunately, the alg...
详细信息
ELSA project concerns massively parallelarchitectures on silicon dedicated especially to low-level image processing. Real-time low-level image processing demands a huge amount of computing power. Fortunately, the algorithms encountered in this field are naturally regular which suggests a regular architecture to solve them. One of the most efficient scheme is array processors. This array processor has been implemented on a whole wafer instead of implementing it in VLSI chips each containing a few processing elements. Potential advantages of wafer scale integration over conventional VLSI systems include lower power, higher speed and small volume. However, WSI suffers from low yield. Redundancy and reconfiguration techniques are used to enhance the overall yield. Both of these have been implemented in ELSA. Three software packages have been developed to completely (re)configure the wafer and build a working target array.< >
Yield analysis for reconfigurable structures is often difficult, due to the defect distribution and irregularity of reconfiguration algorithms. In this paper, the authors give a method to analyze the yield of reconfig...
详细信息
Yield analysis for reconfigurable structures is often difficult, due to the defect distribution and irregularity of reconfiguration algorithms. In this paper, the authors give a method to analyze the yield of reconfigurable pipelines for the following model: Given n pipelines with m stages, where each stage of a pipeline is defective with constant probability and spare wires are provided for reconfiguration, the authors calculate the expected percentage of pipelines they can harvest after reconfiguration. By modeling the 'shifting' reconfiguration as weighted chains in a lattice and applying poset theory, they give upper and lower bounds for the harvest rate as a function of m and n.< >
This conference proceeding contains 45 papers. The topics covered include: routing algorithms;array processing;message passing;fault tolerance;hypercubes;scheduling;parallel algorithms;memory sharing;computational com...
详细信息
ISBN:
(纸本)0897913701
This conference proceeding contains 45 papers. The topics covered include: routing algorithms;array processing;message passing;fault tolerance;hypercubes;scheduling;parallel algorithms;memory sharing;computational complexity;PRAMs;graph theory and data structures.
In this paper we give efficient parallel algorithms for a number of problems from computational geometry by using generalized versions of parallel plane sweeping. We illustrate our approach with a number of applicatio...
详细信息
ISBN:
(纸本)0897913701
In this paper we give efficient parallel algorithms for a number of problems from computational geometry by using generalized versions of parallel plane sweeping. We illustrate our approach with a number of applications, which include general hidden-surface elimination (even if the overlap relation contains cycles), CSG boundary evaluation, computing the contour of a collection of rectangles, and hidden-surface elimination for rectangles. Our algorithms are for the CREW PRAM.
暂无评论