The well-known advantages of pipelined systolic array architectures are applied to the implementation of a second-order recursive digital filter. The proposed structure achieves a five-fold increase in system throughput over standard techniques and a two-fold increase over usual systolic approaches. This letter presents the data flow operation and the basic cell implementation for this design.
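For orientation, the recurrence that any second-order recursive filter must compute can be sketched as below. This is only the plain sequential difference equation, not the systolic structure of the letter; the function name `biquad` and the coefficient layout are assumptions made for illustration. The dependence of each output on the two previous outputs is the feedback loop that makes such filters hard to pipeline.

```python
def biquad(x, b, a):
    """Second-order recursive filter (sequential reference sketch).

    Computes y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2].
    b = (b0, b1, b2) and a = (a1, a2) are hypothetical coefficient tuples.
    """
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0  # delayed inputs and outputs, initially zero
    out = []
    for xn in x:
        # The feedback terms (y1, y2) must be available before y[n] can be formed.
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return out
```

For example, `biquad([1, 0, 0], (1, 0, 0), (-0.5, 0))` yields the first three samples of the impulse response of a one-pole filter with pole at 0.5.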
ISBN: 0780338065 (print)
This paper presents a novel FPGA implementation of a two-dimensional (8x8)-point Discrete Cosine Transform. It shows how developing a suitable architectural style can produce high-quality circuit designs for a specific technology, in this case the Xilinx XC6200 series of FPGAs. Distributed arithmetic and the exploitation of parallelism and pipelining are used to produce a DCT implementation on a single FPGA that operates at 25 frames per second at VGA resolution, which is the equivalent of 2 million multiplications or additions per second.
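The distributed arithmetic mentioned above replaces multipliers with a lookup table of coefficient sums that is addressed by one bit from each input per step. A minimal software sketch follows, assuming integer coefficients and two's-complement inputs; the names `da_lut` and `da_dot` are hypothetical, and the sketch illustrates the general technique rather than the paper's XC6200 circuit.

```python
def da_lut(coeffs):
    """Table entry `addr` holds the sum of coeffs[k] for every set bit k of addr."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << n)]


def da_dot(coeffs, xs, bits=8):
    """Inner product sum(c*x) computed bit-serially, with no multiplications by x.

    Assumes integer coefficients and inputs that fit in `bits`-bit two's complement.
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if any(not lo <= x <= hi for x in xs):
        raise ValueError("input does not fit in the given word length")
    lut = da_lut(coeffs)
    acc = 0
    for b in range(bits):
        # Gather bit b of every input word into one table address.
        addr = 0
        for k, x in enumerate(xs):
            addr |= ((x >> b) & 1) << k
        term = lut[addr] * (1 << b)  # a shift in hardware
        # The sign bit of a two's-complement word carries negative weight.
        acc += -term if b == bits - 1 else term
    return acc
```

In hardware the table is a small ROM or LUT and the loop over `b` becomes a shift-accumulate, which is why the approach maps well to FPGA logic cells.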
ISBN: 0780349970 (print)
Based on the model of synchronous data flow (SDF) [13], so-called single appearance schedules are known to provide memory-optimal schedules. Among these, the problem of buffer memory optimization is treated: (1) An Evolutionary Algorithm (EA) is applied to efficiently explore the (in general) exponential search space of actor firing orders. (2) For each order, the buffer costs are evaluated by applying a dynamic programming post-optimization step (GDPPO).
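To make the buffer-cost notion concrete, the sketch below computes the repetitions vector of a small SDF graph from its balance equations and measures the buffer memory of a given firing sequence by simulation. It does not implement the EA or GDPPO; the function names, the edge format `(src, dst, produced, consumed)`, and the three-actor example graph are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import gcd


def repetitions(actors, edges):
    """Smallest positive integer solution of q[src]*produced == q[dst]*consumed.

    Assumes a connected graph; raises ValueError if the rates are inconsistent.
    """
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, dst, p, c in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * p / c
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * c / p
                changed = True
            elif src in rate and dst in rate and rate[src] * p != rate[dst] * c:
                raise ValueError("inconsistent sample rates")
    scale = 1
    for r in rate.values():
        scale = scale * r.denominator // gcd(scale, r.denominator)
    q = {a: int(r * scale) for a, r in rate.items()}
    g = 0
    for v in q.values():
        g = gcd(g, v)
    return {a: v // g for a, v in q.items()}


def buffer_cost(firings, edges):
    """Sum over all edges of the peak token count reached during the schedule."""
    tokens = [0] * len(edges)
    peak = [0] * len(edges)
    for actor in firings:
        for i, (src, dst, p, c) in enumerate(edges):
            if dst == actor:
                if tokens[i] < c:
                    raise ValueError("actor fired without enough input tokens")
                tokens[i] -= c
        for i, (src, dst, p, c) in enumerate(edges):
            if src == actor:
                tokens[i] += p
                peak[i] = max(peak[i], tokens[i])
    return sum(peak)


# Hypothetical example: A produces 20, B consumes 10; B produces 20, C consumes 10.
EDGES = [("A", "B", 20, 10), ("B", "C", 20, 10)]
```

With this graph the repetitions vector is A=1, B=2, C=4. The flat single appearance schedule A(2B)(4C), fired as `ABBCCCC`, needs 60 tokens of buffer, while the nested schedule A(2B(2C)), fired as `ABCCBCC`, needs 40, which shows why the loop nesting of a single appearance schedule matters for buffer memory.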
ISBN: 9781509033614 (print)
Power and thermal characteristics have emerged as first-order design goals for all types of semiconductors, including embedded signal and information processing systems. This paper surveys the basic physical principles of power and thermal behavior and argues that statistical models, such as Markov decision processes, are well-suited to the management of power and thermal behavior at both design time and run time.
ISBN: 0780338065 (print)
Many signal processing applications are computationally intensive and cannot be implemented on a single processor. The concept of automatic parallel implementation of such applications on multiple connected processors has attracted attention in recent years. This paper presents a new design environment, Taurus, which allows automatic parallel mapping of DSP algorithms and applications onto multiprocessor hardware platforms. The front end to Taurus is a standard, commercially available software package, and the interconnection topology and the processor specification are user-definable. Both transputers and TI TMS320 C40 DSP chips have been used. The system architecture, special features, and details of the building blocks are presented and discussed.
ISBN: 0780377958 (print)
High-end video and multimedia processing applications today require huge amounts of memory. For cost reasons, the use of conventional dynamic RAM (SDRAM) is preferred. However, accessing SDRAM is a complex task, especially when multi-stream access, different stream types, and real-time capability are required. This paper describes a multi-stream SDRAM controller IP that covers different stream types and applies memory scheduling to achieve high bandwidth utilization. Two different architectures are presented and discussed; simulation results with a realistic application configuration demonstrate up to 90% of maximum memory bandwidth utilization. The scheduler IP is suitable for FPGA implementation and is flexible enough to be used in other applications.
ISBN: 0780338065 (print)
Software synthesis is an increasingly important problem in the design of digital signal processing systems, since multimedia systems in particular are increasingly implemented using a combination of programmable and hardwired processing elements. This paper describes new algorithms for system-level software synthesis, namely the scheduling and allocation of a set of tasks executing on a heterogeneous multiprocessor. The algorithm is hierarchical: it takes advantage of the hierarchical structure of the system's task graph to allocate and schedule processes on the multiprocessor so that the hard real-time constraints on the tasks are met. The algorithm takes into account both on-chip and off-chip data communications and the behavior of data cache memories. It can significantly improve task schedulability by carefully scheduling processes and data transfers to minimize data cache misses and off-chip data accesses.
ISBN: 0780375874 (print)
A systolic architecture is described for computing the 1-D discrete Fourier transform. It provides a significant reduction in array area by reducing the number of complex multipliers compared to previous systolic approaches. This design improvement is achieved by taking advantage of a more efficient computation scheme based on symmetries in the coefficient matrix and a radix-4 butterfly. Comparisons are provided with previous systolic architectures. The systolic architecture designs were created using a CAD tool able to find optimal non-uniform array designs starting from high-level coded descriptions of the algorithm.
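The saving from a radix-4 butterfly can be seen in a small software sketch: after the three twiddle multiplications, the four outputs are combined using only factors of 1, -1, j, and -j, which need no complex multiplier. The code below is a generic recursive radix-4 decimation-in-time transform for lengths that are powers of 4, not the systolic array of the paper; the function names are assumptions.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used here only as a reference."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * k * m / n) for k in range(n))
            for m in range(n)]


def fft_radix4(x):
    """Recursive radix-4 decimation-in-time DFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4:
        raise ValueError("length must be a power of 4")
    # Four interleaved sub-transforms of length n/4.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    q = n // 4
    out = [0j] * n
    for k in range(q):
        # Only these three twiddle products need a general complex multiplier.
        t0 = subs[0][k]
        t1 = cmath.exp(-2j * cmath.pi * k / n) * subs[1][k]
        t2 = cmath.exp(-2j * cmath.pi * 2 * k / n) * subs[2][k]
        t3 = cmath.exp(-2j * cmath.pi * 3 * k / n) * subs[3][k]
        # Radix-4 butterfly: factors are powers of -j, i.e. sign changes and swaps.
        out[k] = t0 + t1 + t2 + t3
        out[k + q] = t0 - 1j * t1 - t2 + 1j * t3
        out[k + 2 * q] = t0 - t1 + t2 - t3
        out[k + 3 * q] = t0 + 1j * t1 - t2 - 1j * t3
    return out
```

Comparing `fft_radix4` against `dft` on a 16-point input is a quick sanity check that the butterfly signs are right.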
When designing digital hardware for a given signal processing algorithm, one must ensure that signals arrive at the appropriate points of the implementation at the right times, so that they are operated upon as specified in the algorithm. It is easy to err even for systems of moderate size and complexity, and a single error can cause the output to be complete garbage. This problem is particularly severe for bit-serial designs. Moreover, once a correct implementation has been obtained, it is not always easy to see how to improve it. In this paper, a systematic technique is presented to derive correct schedules for a synchronous digital system, given a signal flow graph for an algorithm. It is also shown how to use this technique to derive designs that are optimal in having the lowest latency, the highest throughput, or the smallest number of registers. The same technique can also be used to verify digital systems that have already been designed.
ISBN: 9781424403820 (print)
Chip-level equalization has proven to be one of the key enabling technologies for HSDPA (High Speed Downlink Packet Access) receivers. Although many complicated algorithms (Kalman, etc.) have been reported to perform very well, their complexity and irregularity make an efficient parallel software implementation difficult. Targeting a processor-based SDR (Software Defined Radio) platform, our goal is to design a practical HSDPA chip-level equalizer with an implementation cost as low as that of NLMS but with a considerable performance improvement over NLMS. Our proposal is based on observations of multiple-domain sparseness in cellular channels. The first observation is that the Channel Impulse Response (CIR) has only a few significant taps. Although previous work exploits this for complexity reduction, we use it to improve the BER instead. The second observation is that the channel dynamics are not always significant, based on which we propose a feedback-control-based technique that makes the equalizer aware of variations in the channel dynamics. In addition, the equalizer becomes scalable in terms of quality versus cost. By exploiting both forms of sparseness, the proposed HSDPA chip equalizer can significantly lower the BER error floor introduced by channel dynamics, so that more than 5 dB of SNR gain can be achieved at the same implementation cost (by scalability) as NLMS. The design is demonstrated on a TI TMS320C6711.
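As a point of reference for the cost baseline named above, a textbook real-valued NLMS adaptive filter can be sketched as follows. This is the standard NLMS update only, not the proposed sparseness-aware equalizer; the function name `nlms` and the parameters `mu` and `eps` are assumptions chosen for illustration.

```python
def nlms(x, d, taps, mu=0.5, eps=1e-8):
    """Normalized LMS: adapt weights w so that the filtered x tracks d.

    x: input samples, d: desired samples, taps: filter length,
    mu: step size (0 < mu < 2), eps: small regularizer against division by zero.
    Returns the final weights and the per-sample error sequence.
    """
    w = [0.0] * taps
    buf = [0.0] * taps  # most recent input first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y
        # Normalizing by the input energy makes the step size scale-invariant.
        norm = eps + sum(bi * bi for bi in buf)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors
```

Each sample costs on the order of `taps` multiply-accumulates for the filter output and as many again for the update, which is the low and regular per-sample cost that makes NLMS attractive on programmable DSPs.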