ISBN: (print) 0444701044
Numerous applications require ever-increasing computational power, which classical sequential computers can hardly provide. Many of the algorithms involved exhibit a large amount of parallelism. Current vector computers demonstrate the validity of this parallelization scheme, but they also demonstrate the need for more flexible parallel processing. OPSILA is an investigation of a parallel architecture that mixes two processing modes: the vector SIMD mode and the parallel Single Program Multiple Data (SPMD) mode.
Three-dimensional display of moving images greatly enhances realism and adds a unique sense of 'presence'. Three-dimensional video systems have been kept from widespread application by two technical problems: the need for glasses, viewing hoods, or other cumbersome devices for image steering; and the high bandwidths needed for transmission. Devices that avoid the discomfort of headgear by using autostereoscopic (pseudoholographic) displays are known, but these methods require even higher bandwidths to be effective. The author introduces digital predictive coding as a means of data compression for the transmission or storage of the set of spatially related images needed for an autostereoscopic display (interframe coding with frame memories). The algorithms, implementations, and application of a new predictor, called a disparity corrected predictor (DCP), are described.
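The abstract does not spell out how the DCP is computed, so the following is only a minimal sketch of the general idea of disparity-corrected prediction, assuming a one-dimensional exhaustive search: one scan line of a view is predicted from the same line of a neighbouring view shifted by a horizontal disparity, and only the residual would need to be coded. The function name, the sum-of-absolute-differences cost, and the edge clamping are illustrative assumptions, not the author's method.

```python
def disparity_corrected_residual(ref_row, cur_row, max_disparity):
    """Return (best_disparity, residual) for predicting cur_row from ref_row.

    Hypothetical helper: tries every integer shift in
    [-max_disparity, max_disparity] and keeps the one with the smallest
    sum of absolute prediction errors.
    """
    n = len(cur_row)
    best = None
    for d in range(-max_disparity, max_disparity + 1):
        # Predict sample i from the reference view at i + d, clamping at the edges.
        pred = [ref_row[min(max(i + d, 0), n - 1)] for i in range(n)]
        residual = [c - p for c, p in zip(cur_row, pred)]
        cost = sum(abs(r) for r in residual)
        if best is None or cost < best[0]:
            best = (cost, d, residual)
    return best[1], best[2]


# The second view is the first one shifted right by two samples.
left = [0, 0, 5, 9, 5, 0, 0, 0]
right = [0, 0, 0, 0, 5, 9, 5, 0]
disparity, residual = disparity_corrected_residual(left, right, 3)
```

When the two views differ only by a shift, the residual is all zeros, which is what makes the prediction attractive for compression.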
Warp is a programmable systolic array machine. The first large-scale version of the machine, with an array of 10 linearly connected cells, became operational in January 1986. Each cell in the array is capable of performing 10 million 32-bit floating-point operations per second (10 MFLOPS). The 10-cell array can achieve a performance of 50 to 100 MFLOPS for a large variety of signal processing operations such as digital filtering, image compression, and spectral decomposition. The machine, augmented by a boundary processor, is particularly effective for computationally expensive matrix algorithms, such as the solution of linear systems, QR decomposition, and singular value decomposition, that are crucial to many real-time signal processing tasks. The authors outline the Warp implementation of the 2-dimensional discrete cosine transform and singular value decomposition.
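For readers unfamiliar with the transform mentioned above, here is a minimal sequential sketch of a 2-dimensional DCT computed by the standard row-column decomposition (an orthonormal DCT-II applied to every row, then to every column). It is a plain reference computation under that assumption and does not represent how the computation is mapped onto the Warp cells; the function names are illustrative.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """2-D DCT by row-column decomposition: transform rows, then columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]  # one entry per column
    return [list(r) for r in zip(*cols)]          # transpose back to row-major


# A constant 4x4 block has all of its energy in the DC coefficient.
coeffs = dct_2d([[1.0] * 4 for _ in range(4)])
```

Because the 2-D transform splits into independent 1-D transforms, the row pass and the column pass each expose one independent task per row or column.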
The single-chip, high-performance signal processor design described in this paper departs from existing processor designs both in the way it is organized and the manner in which it performs computations. Major emphasi...
ISBN: (print) 0818607246
Two massively parallel processing architectures are presented which are suitable for solving a wide variety of divide-and-conquer type algorithms for problems such as the discrete Fourier transform, production systems, design automation and others. The first architecture, called the Chain-structured Butterfly architecture (CBAR), consists of a two-dimensional array of $N = L(\log_2 L + 1)$ processing elements (PEs) organized as $L$ levels of $\log_2 L + 1$ stages, and has the butterfly connection between PEs in consecutive stages with straight-through feedback between PEs in the last and first stages. This connection system has the desirable property of allowing thousands of PEs to be connected with $O(N)$ connection cost, $O(\log_2 (N/\log_2 N))$ communication paths and a small number (4) of I/O ports per PE. However, this architecture is not fault-tolerant. The authors, therefore, propose a second architecture, called the REconfigurable Chain-structured Butterfly architecture (RECBAR), which possesses all the desirable features of the CBAR, with the number of I/O ports per PE increased to six, and uses $O(\log_2 N / N)$ overhead in PEs and approximately 50% overhead in links to achieve single-level fault tolerance. The reliability improvements offered by the RECBAR are examined.
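The PE count and the connection pattern described above can be sketched as follows. The PE count follows directly from the stated formula. The exact butterfly wiring (which bit of the level index is flipped between stage s and stage s + 1) is not given in the abstract, so the conventional choice used here is an assumption, as are the function names.

```python
def cbar_pe_count(levels):
    """Number of PEs, N = L * (log2(L) + 1), for L a power of two."""
    if levels <= 0 or levels & (levels - 1):
        raise ValueError("levels must be a positive power of two")
    stages = levels.bit_length()  # equals log2(L) + 1 for a power of two
    return levels * stages


def butterfly_successors(level, stage, levels):
    """PEs that receive data from PE (level, stage).

    Assumed wiring: stage s connects straight ahead and to the level whose
    index differs in bit s; the last stage feeds straight back to stage 0.
    """
    last_stage = levels.bit_length() - 1
    if stage == last_stage:
        return [(level, 0)]
    return [(level, stage + 1), (level ^ (1 << stage), stage + 1)]


n_pes = cbar_pe_count(8)                   # 8 levels, 4 stages
first_hop = butterfly_successors(5, 0, 8)  # straight and cross links
feedback = butterfly_successors(5, 3, 8)   # last stage back to the first
```

Under this assumed wiring every PE has at most two outgoing and two incoming links.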
Author: Pachowicz, P.W. (Institute for Control Systems Engineering and Telecommunication, Academy of Mining and Metallurgy, al. Mickiewicza 30, Cracow 30-059, Poland)
The idea of a co-processor to process distributed or marked local image data is described. The architecture, instruction format, and time effects for some algorithms are pointed out. Time effects for image processin...
ISBN: (print) 0818607491
An interconnection scheme based on a bus network consisting of high-speed time-sliced buses and interbus links of matching bandwidth is described. Simulation results and two contrasting approaches to simulating such a machine are discussed. The network is best applied to problems commonly found in real-time processing which exhibit locality in their communication patterns (e.g., image processing).
In this paper we study several parallel machines which have been built since 1972 and describe a parallel cellular machine; each cell owns function hardware which can be reconfigured by setting firmware within the cells, so that the cells form a fast parallel computation for processing some computer vision and AI problems. The machine is proposed together with mappings of some image processing operations onto the cells.
ISBN: (print) 0818607246
Circuit-switched interconnection networks for resource sharing in multiprocessors, named resource-sharing interconnection networks, are studied. Resource scheduling in systems with such an interconnection network entails the efficient search for a mapping from requesting processors to free resources so that circuit blockages in the network are minimized and resources are maximally used. The optimal mapping is obtained by transforming the scheduling problems into various network-flow problems to which existing algorithms can be applied. A distributed architecture to realize a maximum-flow algorithm using token propagations is also described. The method is applicable to any general network configuration modeled as a digraph in which the requesting processors and free resources can be partitioned into two disjoint subsets.
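The reduction from scheduling to network flow can be illustrated with a small sequential sketch. It assumes a hypothetical toy network with unit-capacity links and uses the textbook Edmonds-Karp maximum-flow algorithm; it is not the distributed token-propagation realization described in the paper, and all node names are made up for illustration.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a digraph given as {u: {v: capacity}}."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
            residual[v][u] += 0  # make sure the reverse edge exists
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck


# Hypothetical network: p1 and p2 request a resource; r1 and r2 are free.
# Both processors can reach r1 through one shared switch link; p2 also has
# a direct circuit to r2.
network = {
    "src": {"p1": 1, "p2": 1},
    "p1": {"sw": 1},
    "p2": {"sw": 1, "r2": 1},
    "sw": {"r1": 1},
    "r1": {"sink": 1},
    "r2": {"sink": 1},
}
served = max_flow(network, "src", "sink")
```

With unit capacities, the value of the maximum flow is the largest number of requests that can be served at once without two circuits sharing a link.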
ISBN: (print) 0818607211
The authors examine the applicability of fine-grained tree-structured SIMD machines, which are amenable to highly efficient VLSI implementation, to image correlation, which is representative of image window-based operations. Several algorithms are presented for image shifting and correlation. A particular massively parallel machine called NONVON is used for purposes of explication and performance evaluation, but only its tree-structured communication capabilities and its SIMD mode of execution are considered. Novel algorithmic techniques are described, such as vertical pipelining, subproblem partitioning, associative matching, and data duplication, that effectively exploit the massive parallelism available in fine-grained SIMD tree machines while avoiding communication bottlenecks. Simulation results are presented, and the relative advantages and limitations of the class of machines under consideration are outlined.
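As a point of reference for the window-based operation discussed above, the following is a minimal sequential sketch of image correlation: the template is slid over the image and, at each position, the products of overlapping samples are summed. It assumes unnormalized correlation over fully overlapping positions only and does not model the tree-structured SIMD algorithms of the paper; the function name is illustrative.

```python
def correlate(image, template):
    """Unnormalized 2-D correlation over all fully overlapping positions."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    return [
        [
            sum(
                image[y + dy][x + dx] * template[dy][dx]
                for dy in range(th)
                for dx in range(tw)
            )
            for x in range(iw - tw + 1)
        ]
        for y in range(ih - th + 1)
    ]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
template = [[1, 0],
            [0, 1]]
result = correlate(image, template)  # [[6, 8], [12, 14]]
```

Each output value depends only on a small window of the input, so all output positions can in principle be computed independently.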