Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III and P4 classes of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a ...
Discovery of sequential patterns is becoming increasingly useful and essential in many scientific and commercial domains. Enormous sizes of available datasets and possibly large number of mined patterns demand efficie...
We have designed and implemented a lightweight process (thread) library called "Lesser Bear" for SMP computers. Lesser Bear is highly portable and offers thread-level parallelism. It executes threads in parallel by creating UNIX processes as virtual processors and a memory-mapped file as a large shared-memory space. To schedule threads in parallel, the shared-memory space is divided into a working space for each virtual processor, and the ready queue is distributed among them. However, the previous version of Lesser Bear sometimes required a lock operation for dequeueing. We therefore propose a scheduling mechanism that does not require a lock operation. To achieve this, the divided spaces form a rotatory topology through the queues, and we use a lock-free algorithm for the queue operations. This mechanism is applied to Lesser Bear and evaluated experimentally.
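The abstract does not say which lock-free queue algorithm Lesser Bear uses. As a minimal sketch of one common design, the single-producer/single-consumer ring buffer below avoids locks because each index has exactly one writer. The class name `SpscRing` and its methods are hypothetical, and Python is used only to show the index discipline: a C implementation over a memory-mapped file would additionally need atomic loads and stores with suitable memory ordering, which this sketch does not show.

```python
class SpscRing:
    """Bounded queue for exactly one producer and one consumer (illustrative)."""

    def __init__(self, capacity):
        # One slot is always left empty so that "full" and "empty" differ.
        self._buf = [None] * (capacity + 1)
        self._head = 0  # next slot to read; written only by the consumer
        self._tail = 0  # next slot to write; written only by the producer

    def enqueue(self, item):
        """Producer side: return False instead of blocking when full."""
        nxt = (self._tail + 1) % len(self._buf)
        if nxt == self._head:
            return False
        self._buf[self._tail] = item
        self._tail = nxt  # publish the item only after it has been stored
        return True

    def dequeue(self):
        """Consumer side: return None instead of blocking when empty."""
        if self._head == self._tail:
            return None
        item = self._buf[self._head]
        self._head = (self._head + 1) % len(self._buf)
        return item
```

If each virtual processor only enqueues into its neighbour's queue and only dequeues from its own, every queue has one producer and one consumer. That is one possible reading of the rotatory topology, stated here as an assumption rather than as the authors' design.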
A new interpolation filtering architecture using a block structure and a look-up table (LUT) is proposed. Using the parallel processing inherited from the block structure, the filtering rate and the power consumption are lowered, which makes the proposed architecture appropriate for the modulator in a mobile communication system. In addition, by exploiting the symmetry of the filter coefficients, the LUT size and memory requirement are reduced. The proposed filter architecture is generalized by reconstructing the LUT. A comparison with prior LUT-based architectures showed that the proposed filter architecture is more area-efficient.
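To illustrate how coefficient symmetry can shrink a LUT, here is a minimal sketch under assumptions that are not taken from the abstract: the input symbols are binary (mapped to +1/-1), the filter is a symmetric FIR of length K*L split into L polyphase branches, and all function names are hypothetical. Only about half of the branches are tabulated; the other half reuse the same table with the bit pattern reversed. The block structure of the proposed architecture is not modelled.

```python
def build_half_lut(h, L):
    """Tabulate branch outputs for phases 0 .. ceil(L/2)-1 of a symmetric filter h."""
    K = len(h) // L
    lut = []
    for p in range((L + 1) // 2):
        row = []
        for pattern in range(1 << K):
            acc = 0.0
            for k in range(K):
                symbol = 1.0 if (pattern >> k) & 1 else -1.0
                acc += h[k * L + p] * symbol
            row.append(acc)
        lut.append(row)
    return lut


def reverse_bits(pattern, K):
    """Reverse the order of the K low bits of pattern."""
    out = 0
    for _ in range(K):
        out = (out << 1) | (pattern & 1)
        pattern >>= 1
    return out


def lut_interpolate(bits, h, L):
    """Interpolate by L using table look-ups only (output starts once K bits are in)."""
    K = len(h) // L
    lut = build_half_lut(h, L)
    half = len(lut)
    mask = (1 << K) - 1
    pattern = 0
    out = []
    for n, b in enumerate(bits):
        pattern = ((pattern << 1) | b) & mask  # bit k holds the symbol from k steps ago
        if n < K - 1:
            continue
        rev = reverse_bits(pattern, K)
        for p in range(L):
            # Phase L-1-p of a symmetric filter is phase p with its taps reversed.
            out.append(lut[p][pattern] if p < half else lut[L - 1 - p][rev])
    return out


def direct_interpolate(bits, h, L):
    """Reference: the same polyphase outputs computed by multiply-accumulate."""
    K = len(h) // L
    out = []
    for n in range(K - 1, len(bits)):
        for p in range(L):
            out.append(sum(h[k * L + p] * (1.0 if bits[n - k] else -1.0)
                           for k in range(K)))
    return out
```

With `h = [1, 2, 3, 4, 4, 3, 2, 1]` and `L = 4`, the table holds two phases instead of four, and `lut_interpolate` agrees with `direct_interpolate` on any bit sequence.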
A novel architecture for real-time synthetic aperture radar signal processing is presented; it achieves real-time operation by using a recently proposed signum coded algorithm and time domain processing. The new architecture is completely parallel and can be dynamically reconfigured to handle different dimensions of data and filter matrices. A standard-cell VLSI implementation is presented and its performance is evaluated through circuit simulations.
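The abstract gives no details of the signum coded algorithm. The sketch below assumes it means keeping only the sign of each sample (one-bit coding) before a time-domain correlation against a reference. It uses real-valued samples for brevity, whereas SAR data is normally complex, and the function names are hypothetical.

```python
def signum(samples):
    """One-bit coding: keep only the sign of each sample (zero counts as +1)."""
    return [1 if v >= 0 else -1 for v in samples]


def signum_correlate(signal, reference):
    """Time-domain correlation of sign-coded signal and reference.

    Every product is +1 or -1, so in hardware each multiplier reduces to
    an XNOR gate and each sum to a bit count.
    """
    s = signum(signal)
    r = signum(reference)
    n = len(r)
    return [sum(a * b for a, b in zip(s[i:i + n], r))
            for i in range(len(s) - n + 1)]
```

The correlation peaks where the sign pattern of the signal matches that of the reference, with a maximum possible value equal to the reference length.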
Molecular biologists frequently scan sequence databases to detect functional similarities between proteins. Even though efficient dynamic programming algorithms exist for the problem, the required scanning time is sti...
Although FPGA technology offers the potential of designing high performance systems at low cost for a wide range of applications, its programming model is prohibitively low level, requiring either a dedicated FPGA-experienced programmer or basic digital design knowledge. To allow a signal/image processing end-user to benefit from this kind of device, the level of design abstraction needs to be raised, even beyond the hardware description language level (e.g. VHDL). This approach will help the application developer to focus on signal/image processing algorithms rather than on low-level designs and implementations. This paper presents a framework for a wavelet-based high level environment. The proposed approach will help the end-user to generate FPGA configurations for the DWT at the highest level without any knowledge of the low-level design styles and architectures.
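For readers unfamiliar with the DWT that such a framework targets, the following is a minimal software sketch of a single-level Haar transform and its inverse. The Haar wavelet is chosen here only because it is the simplest case; the abstract does not say which wavelets the framework supports, and the function names are hypothetical.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original samples pair by pair."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out
```

Applying `haar_idwt` to the output of `haar_dwt` recovers the input up to floating-point rounding, and a constant signal produces all-zero detail coefficients.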
In order to support a dual mode DECT/GSM terminal architecture with low power characteristics and integrated support for a direct conversion terminal architecture, the basic parts of such a terminal were designed and implemented. These parts include a baseband processor and a modem. The baseband processor is designed to support dual mode operation, all required baseband processing, and different terminal architectures (heterodyne or direct conversion). The modem features a GMSK/GFSK modulator and a novel, low power detection algorithm that supports direct conversion terminals. The architecture of the direct conversion wireless terminal is presented along with details on the low power characteristics of the processor and the modem. Experimental results from the operation of the terminal are also presented.
A systematic approach for the assignment of array type data structures to the layers of fixed memory hierarchies present in instruction set processors is presented. Memory Hierarchy Layer Assignment (MHLA) is required to ensure the efficient exploitation of the data re-use present in multimedia type algorithms. Exploiting data re-use by storing the most frequently accessed and re-used data elements in the smaller levels of a processor's physical memory hierarchy leads to significant savings in execution time and power consumption. The proposed methodology for Memory Hierarchy Layer Assignment takes into consideration architectural features of the target processors such as the fixed physical data memory hierarchy and the hardware control mechanisms of some of the levels (caches) of the memory hierarchy. Experimental results show that exploitation of data re-use combined with the proposed approach for Memory Hierarchy Layer Assignment leads to significant power consumption and performance gains.
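The abstract does not describe the assignment procedure itself. As a rough illustration of the underlying idea only, the sketch below uses a simple greedy heuristic that is not the authors' methodology: arrays with the most accesses per byte are placed in the smallest layer that still has room. The function name, the tuple layouts, and the example figures are all hypothetical, and hardware-controlled caches are not modelled.

```python
def assign_layers(arrays, layers):
    """Greedy layer assignment (illustrative heuristic, not the paper's method).

    arrays: list of (name, size_bytes, access_count)
    layers: list of (layer_name, capacity_bytes), smallest and fastest first
    Returns a dict mapping each array name to a layer name.
    """
    free = [capacity for _, capacity in layers]
    placement = {}
    # Highest access density first, so hot and small arrays claim the fast layers.
    by_density = sorted(arrays, key=lambda a: a[2] / a[1], reverse=True)
    for name, size, _ in by_density:
        for i, (layer_name, _) in enumerate(layers):
            if size <= free[i]:
                free[i] -= size
                placement[name] = layer_name
                break
        else:
            raise ValueError(f"array {name!r} does not fit in any layer")
    return placement


example_arrays = [
    ("coeffs", 256, 100_000),
    ("frame", 100_000, 200_000),
    ("window", 1024, 50_000),
]
example_layers = [("L1", 1024), ("L2", 16_384), ("main", 10_000_000)]
example_placement = assign_layers(example_arrays, example_layers)
```

In this made-up example the small, heavily re-used coefficient table lands in the smallest layer, while the large frame buffer goes to main memory.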
ISBN: (print) 0780370007
In order to fulfil real time signal processing tasks such as clutter rejection, moving target detection (MTD) and constant false alarm rate (CFAR) control in airborne radar, an airborne radar parallel signal processing system (ARPS2) is proposed with DSP chips as its kernel processing nodes. The DSP chips are arranged in a parallel architecture, and each node has its own private input and output memory. The system adopts several parallel techniques, such as parallel storage, parallel processing, parallel code loading and parallel data organization, to achieve high efficiency. It has a simple structure, is highly flexible, and is easy to develop for. ARPS2 is to be applied to an airborne radar. It can also be used to run high-speed real time signal processing algorithms in other kinds of radar.
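The abstract names CFAR control as one of the tasks but does not say which CFAR variant ARPS2 implements. As a minimal sketch, the cell-averaging form below estimates the local noise level from training cells on both sides of the cell under test, skipping guard cells next to it. The function name and parameters are hypothetical.

```python
def ca_cfar(power, train, guard, scale):
    """Cell-averaging CFAR over a list of power samples.

    train: number of training cells on each side of the cell under test
    guard: number of guard cells on each side, excluded from the noise estimate
    scale: threshold multiplier applied to the estimated noise level
    Returns the indices of cells whose power exceeds the adaptive threshold.
    """
    span = train + guard
    detections = []
    for i in range(span, len(power) - span):
        leading = power[i - span:i - guard]
        trailing = power[i + guard + 1:i + span + 1]
        noise = (sum(leading) + sum(trailing)) / (2 * train)
        if power[i] > scale * noise:
            detections.append(i)
    return detections
```

Because the threshold follows the local noise estimate, the false alarm rate stays roughly constant as the clutter level changes. Each cell is tested independently, which is why the computation can be split across processing nodes.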