The memory-based processor array (MPA) was previously designed as an effective memory-processor integrated architecture. The MPA can be easily attached to any host system via a memory interface. In this paper, the impact of the memory interface structure is analyzed for computer vision tasks. An analytical model is constructed to describe the characteristics of the memory interface structure. For vision tasks consisting of sequential and data-parallel subtasks, the memory interface model of the MPA system yields a performance improvement of 6-40%. Mapping algorithms that implement convolution and connected component labeling on the MPA are also presented. The asymptotic time complexities of the algorithms are evaluated to verify the cost-effectiveness and efficiency of the MPA system. (C) 2000 Elsevier Science B.V. All rights reserved.
Electromagnetic field analysis is a time-consuming process, and using an FPGA accelerator is one attractive way to speed it up; other approaches use CPUs and GPUs. In this paper, we propose an FPGA accelerator dedicated to the two-dimensional finite-difference time-domain (FDTD) method. This accelerator is based on a two-dimensional single instruction multiple data (SIMD) array architecture. Each processing element (PE) is composed of a six-stage pipeline that is optimized for the FDTD method. Moreover, driving signal generation and impedance termination are also implemented in hardware. We demonstrate that our accelerator is 11 times faster than existing FPGA accelerators and 9 times faster than parallel computing on the NVIDIA Tesla C2075. As an application of the high-speed FDTD accelerator, the design optimization of a waveguide is shown.
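For orientation, the per-cell work of a two-dimensional FDTD method can be sketched in software. The following is a generic textbook TMz update in normalized units, not the pipeline described in the abstract. The function name `fdtd_tmz`, the Gaussian source, the Courant number of 0.5, and the zero-field (perfect conductor) boundary are all assumptions made for illustration; the abstract's impedance termination is not modeled.

```python
import math


def fdtd_tmz(n, steps, courant=0.5):
    """Run a normalized 2D TMz FDTD simulation on an n x n grid.

    Assumptions (not from the source): unit cell size, normalized fields,
    Ez held at zero on the outer boundary, and an additive Gaussian
    source at the grid center. Returns the final Ez field.
    """
    ez = [[0.0] * n for _ in range(n)]
    hx = [[0.0] * (n - 1) for _ in range(n)]   # Hx sits between Ez[i][j] and Ez[i][j+1]
    hy = [[0.0] * n for _ in range(n - 1)]     # Hy sits between Ez[i][j] and Ez[i+1][j]
    c = n // 2

    for t in range(steps):
        # Magnetic field update from the spatial differences of Ez.
        for i in range(n):
            for j in range(n - 1):
                hx[i][j] -= courant * (ez[i][j + 1] - ez[i][j])
        for i in range(n - 1):
            for j in range(n):
                hy[i][j] += courant * (ez[i + 1][j] - ez[i][j])

        # Electric field update on interior cells from the curl of H.
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                ez[i][j] += courant * (
                    (hy[i][j] - hy[i - 1][j]) - (hx[i][j] - hx[i][j - 1])
                )

        # Additive Gaussian pulse as a stand-in for the driving signal.
        ez[c][c] += math.exp(-((t - 8) / 3.0) ** 2)

    return ez
```

Every cell applies the same arithmetic to its nearest neighbors, which is why the method maps naturally onto a SIMD array of processing elements.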
Let Ω denote the set of permutations performable by a pass through the Omega network (an n-stage shuffle-exchange network), and let π denote an arbitrary permutation of N = 2^n elements. In this paper we characterize the sets {π | πΩ = Ω} (called left invariants) and {π | Ωπ = Ω} (called right invariants). This characterization provides a useful criterion for determining whether or not the composition of two permutations belongs to Ω; it also allows a number of known results to be generalized and their derivations simplified.
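As a point of reference, membership in Ω can be checked by brute force with the standard destination-tag routing argument. This is a minimal sketch of that check, not the invariant-based criterion the abstract derives. The function name `is_omega_admissible` and the convention that `perm[s]` is the destination of source `s` are assumptions.

```python
def is_omega_admissible(perm, n):
    """Return True if perm passes an n-stage Omega network without conflicts.

    perm[s] is the destination of source s, with N = 2**n lines in total.
    """
    size = 1 << n
    if sorted(perm) != list(range(size)):
        raise ValueError("perm must be a permutation of 0..2**n - 1")
    mask = size - 1
    for stage in range(1, n + 1):
        seen = set()
        for src, dst in enumerate(perm):
            # After `stage` shuffle-exchange stages, the packet sits on the
            # line whose address is the low (n - stage) bits of src followed
            # by the high `stage` bits of dst.
            line = ((src << stage) | (dst >> (n - stage))) & mask
            if line in seen:
                return False  # two packets need the same switch output
            seen.add(line)
    return True
```

For N = 4, the identity and the cyclic shift `[1, 2, 3, 0]` pass, while the permutation `[0, 2, 1, 3]` does not, because sources 0 and 2 both need the upper output of the same first-stage switch.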
The distance thresholding method is introduced as a means of reducing power dissipation for self-organising map (SOM)/learning vector quantisation (LVQ) calculations using a single instruction multiple data (SIMD) array. The method requires little additional hardware support. For three standard SOM/LVQ benchmarks, power reductions are approximately 45%.
ISBN: 9781424479948 (print)
This work presents a novel parallel technique to implement stack morphological filters for image processing. The method applies bitwise decomposition to manipulate the grayscale image at the bit-plane level, while simple logical operations and Positive Boolean Functions (PBFs) are executed in parallel to derive the transformed bit-planes. The relationship between the bitwise and threshold decompositions is closely investigated and analysed, which leads us to derive an algorithm whose control flow is fully binary encoded. Furthermore, the algorithm's performance depends on the image histogram, owing to its hierarchical processing and the relationship among the binary decompositions.
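The classical threshold decomposition that the abstract compares against can be illustrated with a small stack filter. This sketch shows only that baseline, not the bitwise algorithm of the paper. It uses the 3-point median, whose PBF is the majority function; the names `majority3` and `stack_median3` are hypothetical.

```python
def majority3(a, b, c):
    """Positive Boolean function of the 3-point median: 1 if at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def stack_median3(signal, levels):
    """3-point median via threshold decomposition.

    signal holds integers in 0..levels. Each sample is split into `levels`
    binary signals (value >= t), the PBF is applied per level, and the
    binary outputs are summed ("stacked") to give the grayscale result.
    Only interior samples are filtered.
    """
    out = []
    for i in range(1, len(signal) - 1):
        window = signal[i - 1:i + 2]
        total = 0
        for t in range(1, levels + 1):
            bits = [1 if v >= t else 0 for v in window]
            total += majority3(*bits)
        out.append(total)
    return out
```

The cost grows with the number of gray levels, which is the motivation for working on the much smaller set of bit-planes instead.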
ISBN: 9781424423538 (print)
A novel approach to implementing gray-scale morphological operations is presented in this work. This new technique is based on the bitwise decomposition of the gray-scale image, yielding bit-planes ordered by bit significance. It is of particular interest for implementations on Focal Plane Processors. Our approach relies on the binary search method to obtain either the maximum or the minimum over a local neighborhood by manipulating the binary levels resulting from the bitwise decomposition with simple logic functions. This contrasts significantly with the classical Threshold Decomposition (TD) approach, on which most current techniques are based. Our method shows better efficiency than TD implementations. Further gains can be obtained because our method depends strongly on the image dynamic range.
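The binary search over bit-planes can be sketched for a single neighborhood as follows. This is a common bit-serial formulation and an assumption about the general idea, not the authors' exact procedure. The names `bitplane_max` and `bitplane_min` are hypothetical, and the minimum is obtained here by complementing the values, which is one of several possible choices.

```python
def bitplane_max(values, bits=8):
    """Maximum of a neighborhood, found one bit-plane at a time (MSB first)."""
    candidates = [True] * len(values)
    result = 0
    for b in range(bits - 1, -1, -1):
        plane = [(v >> b) & 1 for v in values]
        # If any surviving candidate has a 1 in this plane, the maximum has
        # a 1 here too, and candidates with a 0 can no longer be the maximum.
        if any(c and p for c, p in zip(candidates, plane)):
            result |= 1 << b
            candidates = [c and bool(p) for c, p in zip(candidates, plane)]
    return result


def bitplane_min(values, bits=8):
    """Minimum via the maximum of the bitwise-complemented values."""
    mask = (1 << bits) - 1
    return mask - bitplane_max([mask - v for v in values], bits)
```

The loop runs once per bit-plane, so an image whose dynamic range needs fewer bits needs fewer iterations, in line with the dependency on dynamic range noted in the abstract.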