Conceptually, an SIMD machine has the capability to overlap operation of the control unit (CU) with the operation of the processing elements (PEs). Computation of a single program is most efficient when the workload of the CU and the PEs is balanced. Load balancing between the CU and PEs is accomplished by migrating certain computations (e.g., PE-common array index calculations, loop index variable manipulation) from the PEs to the CU and vice versa. The goal of this research is to develop some of the techniques needed for the automatic specification of CU/PE overlap at compile time. As a proof of concept, the ELP compiler has been modified to support experimentation with CU/PE overlap.
This paper is concerned with reducing the rank of the adaptive weight vector in radar array signal processing. The motivation for reducing the rank is that modern space-time processing requires many more weights than ...
Very-high-order FIR filters required for the new modulation schemes associated with wireless computer networks and cellular telephones can be implemented in VLSI circuitry using low-power CMOS technology and a novel application of Residue Number System (RNS) arithmetic. Through this approach, 20-bit equivalent integer arithmetic can be obtained for filters with 8 to 256 taps, with only a modest increase in hardware for filters above 8 taps. Simulations indicate that this new technique can dramatically increase the number of taps implemented on a single VLSI chip when compared with an FIR filter generated using FIRGEN.
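As a rough illustration of how RNS arithmetic splits a wide multiply-accumulate into small independent channels, the sketch below computes one FIR output sample in residue form and converts it back with the Chinese Remainder Theorem. The moduli (127, 128, 129) and all function names are assumptions chosen for illustration (their product slightly exceeds 2**20); the abstract does not state which moduli or conversion method the authors use.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; their product (2,097,024) exceeds 2**20.
MODULI = (127, 128, 129)
M = prod(MODULI)


def to_rns(x):
    """Convert a (possibly negative) integer to its residues."""
    return tuple(x % m for m in MODULI)


def from_rns(residues):
    """Reconstruct a signed integer with the Chinese Remainder Theorem."""
    total = 0
    for r, m in zip(residues, MODULI):
        partial = M // m
        total += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    x = total % M
    # Interpret the upper half of the range as negative values.
    return x - M if x >= M // 2 else x


def rns_fir_output(coeffs, samples):
    """One FIR output sample; each residue channel accumulates independently."""
    acc = [0] * len(MODULI)
    for c, x in zip(coeffs, samples):
        cr, xr = to_rns(c), to_rns(x)
        acc = [(a + ci * xi) % m for a, ci, xi, m in zip(acc, cr, xr, MODULI)]
    return from_rns(acc)


coeffs = [3, -5, 7, 2]
samples = [10, 20, -30, 40]
result = rns_fir_output(coeffs, samples)
direct = sum(c * x for c, x in zip(coeffs, samples))
```

No carries cross between channels, so each channel needs only a 7- or 8-bit datapath here; the result is correct as long as the true sum stays within the signed range of roughly plus or minus M/2.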
An algorithm based on the subband nonuniform discrete Fourier transform (SB-NDFT) is proposed for decoding dual-tone multi-frequency (DTMF) signals. To decode a DTMF signal, its energy at the eight DTMF frequencies must be determined by evaluating samples of the NDFT at these frequencies. In the proposed SB-NDFT algorithm, these NDFT samples are computed by decomposing the input signal into two subbands. Since DTMF signals occupy the low-frequency part of the telephone bandwidth, the higher subband can be discarded for a fast, approximate computation. A performance comparison between algorithms based on the NDFT, SB-NDFT, DFT, and SB-DFT shows that the SB-NDFT requires the lowest number of computations to attain a specified level of performance.
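The core step, evaluating the signal's energy at the eight DTMF frequencies, can be sketched as a direct evaluation of the Fourier sum at those nonuniformly spaced frequencies. This is only the plain NDFT baseline, not the subband decomposition proposed in the abstract. The 8 kHz sampling rate, the 400-sample block length, the simple pick-the-maximum decision rule, and the helper `make_tone` are all assumptions for illustration.

```python
import cmath
import math

DTMF_LOW = (697, 770, 852, 941)        # row frequencies in Hz
DTMF_HIGH = (1209, 1336, 1477, 1633)   # column frequencies in Hz
KEYS = ("123A", "456B", "789C", "*0#D")


def ndft_energy(signal, freq, fs):
    """Squared magnitude of the Fourier sum evaluated at one frequency."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / fs)
              for n, x in enumerate(signal))
    return abs(acc) ** 2


def decode_dtmf(signal, fs=8000):
    """Pick the strongest row and column tone (no validity checks)."""
    row = max(range(4), key=lambda i: ndft_energy(signal, DTMF_LOW[i], fs))
    col = max(range(4), key=lambda i: ndft_energy(signal, DTMF_HIGH[i], fs))
    return KEYS[row][col]


def make_tone(key, fs=8000, n=400):
    """Hypothetical test helper: synthesize a clean DTMF tone for one key."""
    row = next(i for i, r in enumerate(KEYS) if key in r)
    col = KEYS[row].index(key)
    f_lo, f_hi = DTMF_LOW[row], DTMF_HIGH[col]
    return [math.sin(2 * math.pi * f_lo * k / fs) +
            math.sin(2 * math.pi * f_hi * k / fs) for k in range(n)]


decoded = decode_dtmf(make_tone("5"))
```

A practical decoder would also check absolute energy thresholds and twist between the two tones; those checks are omitted here.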
Motion vector field (MVF) prediction methods are presented, followed by a restoration method. These methods, combined with a proposed motion compensated (MC) video coding scheme, are suitable for low bit rate transmission. An expression is derived for the initial estimate of the working MVF based on the preceding MVF. Spatio-temporally adaptive regularization is applied using neighborhood information. The output MVF is used as the initial prediction estimate for a Kalman MVF restoration approach. By applying this method to both the encoder and decoder, the resulting MC MVF and image intensity temporal updates are coded and transmitted. The restoration method produces accurate estimates of the MVF, thus resulting in a significant transmission cost reduction. Experiments with standard video-conference image sequences demonstrate the improved performance of the proposed scheme.
ISBN: 9780818670398 (print)
This paper presents two simple, accurate and efficient delay models, the static delay model and the dynamic delay model, to support performance optimization of VLSI Sea-of-Wires Arrays (SWA). The SWA delay model treats each distributed gate as an attribute-based primitive gate with different internal and external connection wires. Instead of solving differential equations, the SWA model determines delays by lookup from a multi-dimensional table. Only a few microseconds of execution time are needed per gate. The propagation delay along a circuit path is the sum of the delay segments of distributed gates in the path. The critical path of an SWA design can be identified with an O(n) timing analysis algorithm. For most AHPL benchmarks, the table-lookup method achieves a speedup of 5 orders of magnitude over SPICE for the same circuits, with an error margin of less than 7%.
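The combination of table-lookup delays and a linear-time critical-path search can be sketched as follows. This is a generic longest-path pass over a gate graph in topological order, not the authors' algorithm; the two-dimensional table keyed by gate type and fanout, its delay values, and all names are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical lookup table: (gate type, fanout) -> delay in ns.
DELAY_TABLE = {
    ("inv", 1): 0.5, ("inv", 2): 0.7,
    ("nand", 1): 0.8, ("nand", 2): 1.1,
}


def critical_path(gates, edges):
    """gates: {name: gate_type}; edges: list of (driver, load) pairs.

    Returns (total_delay, path). Runs in time linear in gates plus edges.
    """
    succ = defaultdict(list)
    indeg = {g: 0 for g in gates}
    for src, dst in edges:
        succ[src].append(dst)
        indeg[dst] += 1

    # Each gate delay is a table lookup; outputs are treated as fanout 1.
    delay = {g: DELAY_TABLE[(t, max(1, len(succ[g])))] for g, t in gates.items()}

    arrival = dict(delay)            # latest arrival time at each gate output
    pred = {g: None for g in gates}  # predecessor on the slowest path
    queue = deque(g for g in gates if indeg[g] == 0)
    while queue:
        g = queue.popleft()
        for d in succ[g]:
            if arrival[g] + delay[d] > arrival[d]:
                arrival[d] = arrival[g] + delay[d]
                pred[d] = g
            indeg[d] -= 1
            if indeg[d] == 0:
                queue.append(d)

    end = max(arrival, key=arrival.get)
    path = []
    while end is not None:
        path.append(end)
        end = pred[end]
    return arrival[path[0]], path[::-1]


gates = {"a": "inv", "b": "nand", "c": "nand", "d": "inv"}
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
total, path = critical_path(gates, edges)
```

Each gate and edge is visited once, which is where the linear running time comes from; the path delay is simply the sum of the looked-up gate delays along the slowest path.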
Temporal frame interpolation techniques are presented, based on an object-based algorithm for 3-D motion estimation. This algorithm uses a joint estimation-segmentation scheme to minimize the displaced frame difference between a frame and its motion compensated prediction from the previous frame. Depth information is estimated beforehand from each stereo pair. Special attention is paid to the exploitation of occlusion information so as to improve the reconstruction quality of the interpolated frames. Experimental results are used to evaluate the performance of the proposed methods and to compare with more conventional frame interpolation techniques.
Separation by plasma implantation of oxygen (SPIMOX) is a novel method for fabricating silicon-on-insulator (SOI) wafers. This method uses plasma immersion ion implantation (PIII), in which the desired implant voltage is applied to a wafer immersed in a plasma. SPIMOX is particularly suited for thin separation by implantation of oxygen (SIMOX) wafer fabrication. High implantation rates can be achieved in SPIMOX. A dose of nearly $10^{18}$ cm$^{-2}$ with an implant current density of 1 mA cm$^{-2}$ can be achieved in 3 minutes of implantation time. The short implantation time and the simplicity of the implantation equipment make it a potentially more economical method for fabricating SIMOX wafers. Moreover, the theoretical time for implantation remains constant in SPIMOX as wafer size increases.
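The quoted dose can be sanity-checked with a back-of-the-envelope calculation: dose equals current density times time divided by the charge per ion. The sketch below assumes singly charged ions and that the full 1 mA per square centimetre is ion current (secondary electrons and pulsed duty cycle are ignored); neither assumption is stated in the abstract.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs


def implanted_dose(current_density_a_per_cm2, seconds, charge_state=1):
    """Ions per square centimetre, assuming all current is carried by ions."""
    return current_density_a_per_cm2 * seconds / (charge_state * ELEMENTARY_CHARGE)


# 1 mA/cm^2 for 3 minutes, singly charged ions (an assumption).
dose = implanted_dose(1e-3, 180)
```

Under these assumptions the result is on the order of 10^18 ions per square centimetre, consistent with the figure in the abstract. Because the current density is a property of the plasma sheath and not of a scanned beam, the same time applies regardless of wafer area.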
A technique for dynamic real-time focusing of ultrasonic transducer arrays is introduced. This paper assumes a linear transducer array operating in the sequential mode. In this mode, the elements are fired sequentially, one at a time, while all elements receive in parallel. The returns are integrated coherently on the basis of equal geometric phase to synthesize the echoes of an array twice as large. Dynamic focusing is accomplished at any depth through electronic correction of the dominant quadratic phase terms. For this purpose, a second-order Taylor series expansion of the phase is used.
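To show why correcting only the quadratic phase term can suffice, the sketch below compares the exact round-trip path for one transmit/receive element pair with its second-order Taylor approximation. The geometry (an on-axis focal point at depth z, elements at lateral offsets x_t and x_r) and the numerical values are assumptions for illustration, not parameters from the paper.

```python
import math


def exact_path(x_t, x_r, z):
    """Round-trip distance: transmit element -> on-axis point at depth z -> receive element."""
    return math.hypot(x_t, z) + math.hypot(x_r, z)


def quadratic_path(x_t, x_r, z):
    """Second-order Taylor expansion: sqrt(z^2 + x^2) ~ z + x^2 / (2 z)."""
    return 2 * z + (x_t ** 2 + x_r ** 2) / (2 * z)


def focusing_phase(x_t, x_r, z, wavelength):
    """Quadratic phase (radians) to remove for this element pair at depth z."""
    return 2 * math.pi / wavelength * (x_t ** 2 + x_r ** 2) / (2 * z)


# Hypothetical values in millimetres: offsets of 5 and -3, depth 50, wavelength 0.5.
x_t, x_r, z, wavelength = 5.0, -3.0, 50.0, 0.5
path_error = exact_path(x_t, x_r, z) - quadratic_path(x_t, x_r, z)
phase = focusing_phase(x_t, x_r, z, wavelength)
```

For these values the residual path error is a small fraction of a wavelength, which is the sense in which the quadratic term dominates; the approximation degrades as the aperture grows relative to the depth.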
Reducing the size of large address traces by "filtering" them through a small direct-mapped cache is a useful technique for making more efficient use of both secondary storage and processors used for trace-driven simulation. However, when filtered traces are used to drive cache simulators, the distortion introduced by such filtering can produce substantial errors in the miss ratios obtained, despite earlier reports to the contrary. We present the results of a systematic study of such errors, including a model for compensating for some errors.
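The filtering step itself is simple to sketch: run the trace through a small direct-mapped cache and keep only the references that miss. The cache geometry (64 lines of 16 bytes) and the function name are assumptions for illustration; the abstract does not specify the filter parameters, and the compensation model it mentions is not shown.

```python
def filter_trace(addresses, num_lines=64, line_size=16):
    """Return only the references that miss in a direct-mapped filter cache."""
    resident = [None] * num_lines  # block number currently held by each line
    filtered = []
    for addr in addresses:
        block = addr // line_size
        index = block % num_lines
        if resident[index] != block:
            # Miss: install the block and keep the reference in the output trace.
            resident[index] = block
            filtered.append(addr)
    return filtered


trace = [0, 4, 8, 1024, 0, 4]
kept = filter_trace(trace)
```

Here addresses 0 and 1024 map to the same line, so they evict each other and both survive filtering, while the repeated hits to the same block are dropped. The discarded hits are exactly the information a downstream simulator no longer sees, which is the source of the distortion the abstract describes.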