ISBN: 9781424403820 (print)
Adaptive filters are used in many applications of digital signal processing. Digital communications and digital video broadcasting are just two examples. The GSFAP algorithm, discussed in the paper, is characterized by convergence superior to that of the popular NLMS, with only slightly higher complexity. The paper deals with a floating-point-like implementation of the algorithm using FPGA hardware. We present an optimized core for the GSFAP, built using logarithmic arithmetic, which provides very low-cost multiplication and division. The design is crafted to make efficient use of the pipelined logarithmic addition units. The resulting GSFAP core can be clocked at more than 80 MHz on the one-million-gate Xilinx XC2V1000-4 device. It can be used to implement filters of orders 20 to 1000 with a sampling rate exceeding 50 kHz. For comparison, we implemented a similar NLMS core and found that, although it is slightly smaller than the GSFAP core and allows a higher signal sampling rate (around 70 kHz) for the corresponding filter orders, GSFAP has adaptation properties that are much superior to those of NLMS. Our core can thus provide very sophisticated adaptive filtering capabilities for resource-constrained embedded systems.
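For orientation, the NLMS baseline named in the abstract can be sketched in a few lines. This is a minimal illustration in ordinary floating point, not the paper's logarithmic-arithmetic core and not GSFAP itself; the function name `nlms_identify`, the step size `mu`, the regularization `eps`, and the test system `h` are all assumptions chosen for the example.

```python
import random

def nlms_identify(x, d, order, mu=0.5, eps=1e-8):
    """Adapt an FIR filter of the given order so that its output tracks d."""
    w = [0.0] * order      # filter coefficients being adapted
    buf = [0.0] * order    # most recent input samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # filter output
        e = dn - y                                   # a priori error
        norm = eps + sum(b * b for b in buf)         # input energy (normalization)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors

# Hypothetical system-identification setup: recover h from noise-free data.
random.seed(0)
h = [0.5, -0.3, 0.2, 0.1]
x = [random.uniform(-1, 1) for _ in range(2000)]
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]
w, errors = nlms_identify(x, d, order=len(h))
```

Each update needs one division by the input energy, which is the kind of operation the abstract says logarithmic arithmetic makes inexpensive.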
ISBN: 0780377958 (print)
A new class of high-speed link receiver architecture that operates at a fraction of the on-chip clock frequency is proposed. Instead of time-interleaving multiple ADCs as is conventionally done, the received signal is channelized into multiple frequency subbands using a bank of mixers and lowpass filters. The proposed receiver architecture enjoys numerous implementation advantages. Based on the frequency channelized signals, an adaptive synchronization/detection scheme is described. An adaptive solution is necessary since the propagation channel and the analog analysis filters are generally not perfectly known and the symbol rate is incommensurate with the free running ADC sampling rate.
ISBN: 0780377958 (print)
The IEEE defined a standard for floating-point arithmetic, used by processing systems, in its standard 754 [1]. This standard encodes floating-point numbers using a maximum of 64 bits, with a 23-bit fraction in the single-precision format and a 52-bit fraction in the double-precision format. New multimedia terminals require low-power applications; the most important floating-point units (adders and multipliers) account for a significant part of the total power consumed by a modern System-on-Chip. They might dissipate less power using a reduced format representation. To verify this possibility, floating-point operations are simulated in real systems using different formats. In this paper, the multimedia systems operate in different scenarios: wireless communication and image manipulation.
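One way to simulate a reduced format in software is to clear low-order fraction bits of a single-precision value. The sketch below is only an illustration: the abstract does not say which reduced formats were used, so the helper name `truncate_fraction`, the choice of truncation instead of rounding, and the unchanged 8-bit exponent are assumptions.

```python
import struct

def truncate_fraction(value, frac_bits):
    """Return value with its single-precision fraction cut to frac_bits bits."""
    if not 0 <= frac_bits <= 23:
        raise ValueError("frac_bits must be between 0 and 23")
    # Reinterpret the float32 encoding as a 32-bit unsigned integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # Keep sign, exponent, and the top frac_bits bits of the 23-bit fraction.
    mask = (0xFFFFFFFF << (23 - frac_bits)) & 0xFFFFFFFF
    (result,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return result

# 1.75 is 1.11 in binary; keeping one fraction bit leaves 1.1, i.e. 1.5.
reduced = truncate_fraction(1.75, 1)
```

Running an algorithm with every intermediate result passed through such a helper gives a rough picture of how output quality degrades as the fraction is shortened.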
ISBN: 9781424403820 (print)
In this paper, an area-efficient JPEG 2000 codec is implemented on 6.1 mm² in 0.18 μm CMOS technology, dissipating 180 mW at 1.8 V and 60 MHz. It is capable of processing 78 MS/s for lossy coding at 1 bpp and 50 MS/s for lossless coding. Four techniques are used to implement this chip. The pre-compression rate-distortion optimization (pre-RDO) determines truncation points before coding to reduce computations for the EDC. The dataflow conversion and embedded compression reduce the tile memory bandwidth. The bit-plane parallel context formation enables scalable bit-plane coding. Experimental results show that this chip has higher area efficiency than previous works.
ISBN: 9781509033614 (print)
The synthesis and mapping of user designs to configurable hardware is typically performed by heuristics. These approaches analyze the decomposability of the combinational user functions as a starting point and derive appropriate mappings to LUT structures or pull them in from pre-computed implementation libraries. In everyday use, this generally achieves a very competitive trade-off between the time spent for the synthesis and the quality of the produced implementations. A higher pressure on the optimality of an implementation exists when implementation libraries are generated or when critical kernels that are extensively duplicated in a massively parallel design are implemented. In these cases, a formal statement that an implementation within fewer LUTs or a smaller combinational depth is strictly impossible is very valuable. We present a tool that formulates the task of mapping a user design to a configurable hardware structure as a quantified boolean formula (QBF). It then uses a QBF solver to either compute an implementing configuration or to know for sure that the desired functionality cannot be implemented within the provided hardware. In the context of this tool, we also describe different approaches to model configurable interconnects and present the impact of these modelling approaches on the time needed to solve the mapping tasks.
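The mapping question has an exists/forall shape: does a configuration exist such that, for all inputs, the configured hardware matches the target function? The toy sketch below answers it by exhaustive enumeration instead of a QBF solver, for a hypothetical structure of two chained 2-input LUTs; the structure, the helper names, and the two target functions are assumptions made for illustration, not the tool's actual model.

```python
from itertools import product

def lut(config, i0, i1):
    """Evaluate a 2-input LUT whose 4-bit truth table is config."""
    return (config >> (i0 | (i1 << 1))) & 1

def find_mapping(target):
    """Find (c1, c2) with LUT2(LUT1(a, b), c) == target(a, b, c) for all inputs."""
    for c1, c2 in product(range(16), repeat=2):            # exists: configuration
        if all(lut(c2, lut(c1, a, b), c) == target(a, b, c)
               for a, b, c in product((0, 1), repeat=3)):  # forall: inputs
            return c1, c2
    return None  # search was exhaustive, so no configuration exists

def xor3(a, b, c):
    return a ^ b ^ c

def maj3(a, b, c):
    return (a & b) | (a & c) | (b & c)

xor_cfg = find_mapping(xor3)   # a valid configuration pair
maj_cfg = find_mapping(maj3)   # None: majority does not fit this structure
```

A `None` result plays the role of the formal impossibility statement mentioned in the abstract, although enumeration only scales to very small structures.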
In this paper we propose three algorithms for low bit-rate (LBR) video transmission, namely Simple Dynamic Profiling (Simple DP), Minimum Dynamic Profiling (Minimum DP) and Mean Dynamic Profiling (Mean DP). These techniques can be used in conjunction with other low bit-rate techniques. Many of the techniques available for low bit-rate video applications either require a lot of hardware resources or take advantage of the powerful computer platform used for their software implementation. The proposed algorithms, on the other hand, are devised for hardware implementation in systems with limited hardware resources. Compared with the coarse quantization technique, the new techniques not only achieve a better compression ratio (twice that of coarse quantization) but also yield better PSNR results (27 dB for coarse quantization versus 33 to 36 dB for the new techniques).
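The PSNR figures quoted above follow the usual definition, 10·log10(peak² / MSE). A minimal sketch is given below, assuming 8-bit samples (peak value 255) and flat sample lists; the function name `psnr` and the sample data are illustrative and do not come from the paper.

```python
import math

def psnr(reference, degraded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not reference or len(reference) != len(degraded):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)

# Hypothetical samples: a constant error of 10 grey levels gives MSE = 100.
value_db = psnr([100, 100, 100, 100], [110, 110, 110, 110])
```

Because the scale is logarithmic, the reported gap of 6 to 9 dB corresponds to a mean squared error roughly four to eight times lower.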
ISBN: 0780338065 (print)
In a cost-sensitive consumer electronics market, the most cost-efficient implementation of the digital satellite set-top box is required in order to be competitive. The system functionality and the implementation rationale of the set-top box are described. Tradeoffs between hardware and software implementation are examined. Examples are given of how the system design affects the size and bandwidth of the memory requirements. Although current-generation processors are capable of real-time video decoding, they are still not a cost-effective solution for the digital set-top box.
ISBN: 0780377958 (print)
Filter bank processing techniques based on MDCT/IMDCT have been widely adopted in various audio codec standards. Most published IMDCT computing algorithms focus mainly on reducing computing complexity but overlook hardware realization issues, e.g. memory access complexity and the efficient mapping of the computing kernel. In this paper, by exploiting the symmetric properties of the computation, we first convert an N-point IMDCT to an N/2-point DCT-II problem. A fast DCT-II computing scheme is then derived, and the overall scheme is further optimized to remove redundancy. Based on the proposed fast IMDCT computing scheme, a novel design mapping is developed to minimize memory access complexity without stalling the pipelined computation. The mapping features simple address generation, small temporary storage size and low access bandwidth. Performance analyses show that, given the same hardware resource allocation, the proposed design can outperform other well-known IMDCT designs in terms of memory storage size, computing latency or fixed-point implementation error.
ISBN: 9781479965885 (print)
Feedforward deep neural networks that employ multiple hidden layers show high performance in many applications, but they demand complex hardware for implementation. The hardware complexity can be greatly reduced by minimizing the word-length of weights and signals, but direct quantization for fixed-point network design does not yield good results. We optimize the fixed-point design by employing backpropagation-based retraining. The designed fixed-point networks with ternary weights (+1, 0, and -1) and 3-bit signals show only negligible performance loss when compared to their floating-point counterparts. The backpropagation for retraining uses quantized weights and fixed-point signals to compute the output, but utilizes high-precision values for adapting the networks. Character recognition and phoneme recognition examples are presented.
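The retraining idea, a quantized forward pass with high-precision updates, can be sketched for a single linear unit. This is a simplified illustration only: the threshold-based `ternarize`, the [0, 1] signal range in `quantize_signal`, the omission of any weight scale factor, and the squared-error update in `retrain_step` are all assumptions, since the abstract does not give these details.

```python
def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1 using a magnitude threshold."""
    return [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]

def quantize_signal(x, bits=3, lo=0.0, hi=1.0):
    """Clip x to [lo, hi] and round it to one of 2**bits uniform levels."""
    steps = 2 ** bits - 1
    x = min(max(x, lo), hi)
    return lo + round((x - lo) / (hi - lo) * steps) / steps * (hi - lo)

def retrain_step(w_float, x, target, lr, threshold):
    """One update: forward pass with ternary weights, update on float weights."""
    w_q = ternarize(w_float, threshold)
    y = sum(wq * xi for wq, xi in zip(w_q, x))   # output uses quantized weights
    err = y - target
    # The gradient is applied to the high-precision copy, not to w_q.
    return [w - lr * err * xi for w, xi in zip(w_float, x)]

w_q = ternarize([0.8, -0.05, -0.6, 0.1], threshold=0.3)
w_new = retrain_step([0.2, -0.9], [1.0, 1.0], target=0.0, lr=0.1, threshold=0.3)
```

Keeping the floating-point copy lets small updates accumulate until a weight crosses the threshold, which direct quantization of a trained network cannot do.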
ISBN: 0780393333 (print)
In this paper, robust timing and frequency synchronization techniques for OFDMA (OFDM-FDMA) systems are presented. Under the multi-path channel environment of ITU-R M.1225, the detection probability, false alarm probability, missing probability, and mean acquisition time of the proposed timing synchronization scheme are compared with those of the existing method of [4] to demonstrate the advantages of the proposed scheme. The MSE (mean square error) and signal constellation are also presented to show the performance of the carrier frequency offset estimation.