ISBN: 0780377958 (print)
A multiplier-free baseband filter is proposed for low-power VLSI implementation in a Code Division Multiple Access (CDMA) system. The new computationally efficient filter structure is based on a novel prefilter structure involving a pair of even- and odd-length FIR filters with the same band edges. It is shown by means of an example that the new structure not only achieves 45.8% savings in the number of multipliers, but also reduces the word length requirement for the coefficients of an IS-95 CDMA baseband filter. The VLSI implementation shows that the new structure reduces both chip area and power consumption considerably compared with the direct-form implementation.
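As a rough illustration of what "multiplier-free" can mean in practice, the sketch below evaluates an FIR filter whose integer coefficients are written as sums of signed powers of two, so each product becomes shifts and additions. It does not reproduce the prefilter structure of the abstract; the coefficient values and the names `shift_add_multiply`, `fir_shift_add`, `COEFF_TERMS` and `COEFF_VALUES` are assumptions made for the example.

```python
def shift_add_multiply(sample, terms):
    """Multiply an integer sample by a coefficient given as signed powers of two.

    terms is a list of (sign, shift) pairs; the coefficient equals
    the sum of sign * 2**shift over all pairs.
    """
    acc = 0
    for sign, shift in terms:
        if sign > 0:
            acc += sample << shift
        else:
            acc -= sample << shift
    return acc


def fir_shift_add(samples, coeff_terms):
    """Direct-form FIR filter that uses only shifts and additions."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, terms in enumerate(coeff_terms):
            if n - k >= 0:
                acc += shift_add_multiply(samples[n - k], terms)
        out.append(acc)
    return out


# Hypothetical symmetric coefficients: 7 = 8 - 1, 12 = 8 + 4, 7 = 8 - 1.
COEFF_TERMS = [[(+1, 3), (-1, 0)], [(+1, 3), (+1, 2)], [(+1, 3), (-1, 0)]]
COEFF_VALUES = [7, 12, 7]
```

Fewer nonzero power-of-two terms per coefficient means fewer adders in hardware, which is one reason a shorter coefficient word length can matter.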
ISBN: 0780377958 (print)
A new class of high-speed link receiver architectures that operate at a fraction of the on-chip clock frequency is proposed. Instead of time-interleaving multiple ADCs, as is conventionally done, the received signal is channelized into multiple frequency subbands using a bank of mixers and lowpass filters. The proposed receiver architecture enjoys numerous implementation advantages. Based on the frequency-channelized signals, an adaptive synchronization/detection scheme is described. An adaptive solution is necessary because the propagation channel and the analog analysis filters are generally not perfectly known and the symbol rate is incommensurate with the free-running ADC sampling rate.
ISBN: 0780377958 (print)
The IEEE defined a standard for floating-point arithmetic, used by processing systems, in its standard 754 [1]. This standard encodes floating-point numbers using a maximum of 64 bits, with a 23-bit fraction in the single-precision format and a 52-bit fraction in the double-precision format. New multimedia terminals require low-power applications; the most important floating-point units (adders and multipliers) account for a significant part of the total power consumed by a modem System-on-Chip. They might dissipate less power using a reduced-format representation. To verify this possibility, real systems are used to simulate floating-point operations in different formats. In this conference paper, multimedia systems operate in different scenarios: wireless communication and image manipulation.
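The field layout and the idea of a reduced format can be sketched in a few lines. The example below splits a single-precision value into its sign, exponent and 23-bit fraction, and emulates a shorter fraction by zeroing the low-order bits. Truncation without rounding is an assumption chosen for simplicity, and the function names are hypothetical; the abstract does not say which reduced formats or rounding modes were simulated.

```python
import struct


def float32_fields(value):
    """Return (sign, biased exponent, 23-bit fraction) of value as IEEE 754 single."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction


def truncate_fraction(value, kept_bits):
    """Emulate a reduced format by keeping only the top kept_bits fraction bits.

    Assumption: plain truncation (no rounding) of a single-precision value.
    """
    if not 0 <= kept_bits <= 23:
        raise ValueError("kept_bits must be between 0 and 23")
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    mask = 0xFFFFFFFF & ~((1 << (23 - kept_bits)) - 1)
    return struct.unpack(">f", struct.pack(">I", bits & mask))[0]
```

For instance, `float32_fields(1.0)` gives `(0, 127, 0)`, and `truncate_fraction(1.75, 1)` gives `1.5` because only the most significant fraction bit survives.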
ISBN: 0780385047 (print)
Channel coding is an important building block in communication systems since it ensures the quality of service. Irregular Repeat-Accumulate (IRA) codes belong to the class of Low-Density Parity-Check (LDPC) codes and can even outperform the recently introduced Turbo-Codes of current communication standards. The implementation complexity of these channel coding schemes, such as area and achievable throughput, will have a major impact on the decisions of standardization committees. In this paper we investigate implementation issues of IRA codes and analyze the strong interdependency of code performance and architectural dependencies, like throughput and area. We present an architecture template that is capable of decoding hardware-optimized IRA codes which can outperform Turbo-Codes (TC). We demonstrate this new approach through instances synthesized in a 0.13 µm technology.
ISBN: 9781612842271 (print)
This paper is concerned with the inversion of implementations for systems that may generally be nonlinear and time-varying. Specifically, techniques are presented for modifying an implementation of a forward system, represented as an interconnection of subsystems, in such a way that an implementation for the inverse system is obtained. We focus on a class of modifications that leave subsystems in the inverse system unchanged with respect to those in the forward implementation. The techniques are therefore well-suited to the design of matched pre-emphasis and de-emphasis filters, as approximations due to coefficient quantization in the forward system are naturally matched in the inverse. In performing the inversion, an explicit input-output characterization of the system is not required, although the forward system must be known to be invertible. The techniques are applied to the inversion of nonlinear and time-varying systems, as well as to the problem of sparse matrix inversion.
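A toy case may help make the idea concrete. In the sketch below, a forward system adds a nonlinear function of the previous input to the current input, and the inverse is obtained by rewiring the same nonlinear block into a feedback path, without ever deriving an input-output formula for it. The specific system, the integer nonlinearity and the function names are assumptions made for illustration; the paper's general techniques are not reproduced here.

```python
def nonlinear_block(v):
    """Arbitrary integer nonlinearity shared by both systems (an assumption)."""
    return (v * v) % 7 - 3


def forward(x):
    """Forward system: y[n] = x[n] + nonlinear_block(x[n-1]), with x[-1] = 0."""
    y, prev = [], 0
    for sample in x:
        y.append(sample + nonlinear_block(prev))
        prev = sample
    return y


def inverse(y):
    """Inverse system: the same block, now fed by the recovered input."""
    x, prev = [], 0
    for sample in y:
        recovered = sample - nonlinear_block(prev)
        x.append(recovered)
        prev = recovered
    return x
```

Because `nonlinear_block` is reused unchanged, any approximation inside it (for example a quantized coefficient) is automatically matched between the two systems, so `inverse(forward(x))` returns `x` exactly for integer inputs.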
ISBN: 9781424403820 (print)
Adaptive filters are used in many applications of digital signal processing. Digital communications and digital video broadcasting are just two examples. The GSFAP algorithm, discussed in the paper, is characterized by convergence superior to the popular NLMS, with only slightly higher complexity. The paper deals with a floating-point-like implementation of the algorithm using FPGA hardware. We present an optimized core for the GSFAP, built using logarithmic arithmetic, which provides very low-cost multiplication and division. The design is crafted to make efficient use of the pipelined logarithmic addition units. The resulting GSFAP core can be clocked at more than 80 MHz on the one-million-gate Xilinx XC2V1000-4 device. It can be used to implement filters of orders 20 to 1000 with a sampling rate exceeding 50 kHz. For comparison, we implemented a similar NLMS core and found that, although it is slightly smaller than the GSFAP core and allows a higher signal sampling rate (around 70 kHz) for the corresponding filter orders, GSFAP has adaptation properties that are much superior to NLMS, and that our core can provide very sophisticated adaptive filtering capabilities for resource-constrained embedded systems.
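For readers unfamiliar with the NLMS baseline mentioned above, the following is a minimal software sketch of NLMS used for system identification. It uses ordinary floating-point arithmetic rather than logarithmic arithmetic, it is not the GSFAP algorithm, and the function name, step size and noise-free test signal are assumptions chosen for the example.

```python
import random


def nlms_identify(unknown_taps, num_samples=2000, step=0.5, eps=1e-8, seed=0):
    """Estimate the taps of an unknown FIR system with the NLMS update rule."""
    rng = random.Random(seed)
    order = len(unknown_taps)
    weights = [0.0] * order
    window = [0.0] * order  # most recent input sample first
    for _ in range(num_samples):
        window = [rng.uniform(-1.0, 1.0)] + window[:-1]
        desired = sum(h * x for h, x in zip(unknown_taps, window))
        estimate = sum(w * x for w, x in zip(weights, window))
        error = desired - estimate
        # Normalizing by the input energy is what distinguishes NLMS from LMS.
        energy = sum(x * x for x in window) + eps
        gain = step * error / energy
        weights = [w + gain * x for w, x in zip(weights, window)]
    return weights
```

Each update needs one division by the input energy, which hints at why cheap division in the arithmetic unit is attractive for this family of algorithms.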
ISBN: 9781424403820 (print)
In this paper, an area-efficient JPEG 2000 codec is implemented on 6.1 mm² in 0.18 µm CMOS technology, dissipating 180 mW at 1.8 V and 60 MHz. It is capable of processing 78 MS/s for lossy coding at 1 bpp and 50 MS/s for lossless coding. Four techniques are used to implement this chip. The pre-compression rate-distortion optimization (pre-RDO) determines truncation points before coding to reduce computations for the EDC. The dataflow conversion and embedded compression reduce the tile memory bandwidth. The bit-plane parallel context formation enables scalable bit-plane coding. Experimental results show that this chip has higher area efficiency than previous works.
ISBN: 0780338065 (print)
In a cost-sensitive consumer electronics market, the most cost-efficient implementation of the digital satellite set-top box is required in order to be competitive. The system functionality and the implementation rationale of the set-top box are described. Tradeoffs between hardware and software implementation are examined. Examples are given of how the system design affects the size and bandwidth of the memory requirements. Although the current generation of processors is capable of implementing real-time video decoding, it is still not a cost-effective solution for the digital set-top box.
ISBN: 0780377958 (print)
Filter bank processing techniques based on MDCT/IMDCT have been widely adopted in various audio codec standards. Most published IMDCT computing algorithms focus mainly on reducing computing complexity but overlook hardware realization issues, e.g. memory access complexity and the efficient mapping of the computing kernel. In this paper, by exploiting the symmetric properties in computation, we first convert an N-point IMDCT to an N/2-point DCT-II problem. A fast DCT-II computing scheme is then derived, and the overall scheme is further optimized to remove redundancy. Based on the proposed fast IMDCT computing scheme, a novel design mapping is developed to minimize memory access complexity without stalling the pipelined computation. The mapping features simple address generation, small temporary storage size and low access bandwidth. Performance analyses show that, given the same hardware resource allocation, the proposed design can outperform other well-known IMDCT designs in terms of memory storage size, computing latency or fixed-point implementation error.
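The symmetry that makes such a reduction possible can be checked numerically. The sketch below evaluates the textbook IMDCT definition directly (M coefficients in, 2M samples out, without scaling; this notation is an assumption and may differ from the paper's N) and exposes the odd symmetry of the first half and the even symmetry of the second half of the output. It is a slow reference, not the fast scheme of the abstract.

```python
import math


def imdct(coeffs):
    """Direct O(M^2) IMDCT: M coefficients in, 2M time samples out (unscaled)."""
    m = len(coeffs)
    return [
        sum(
            c * math.cos(math.pi / m * (n + 0.5 + m / 2) * (k + 0.5))
            for k, c in enumerate(coeffs)
        )
        for n in range(2 * m)
    ]


def symmetry_errors(coeffs):
    """Return the largest deviations from the two output symmetries.

    First half:  y[M-1-n]  == -y[n]     for n in 0..M-1
    Second half: y[2M-1-n] ==  y[M+n]   for n in 0..M-1
    """
    m = len(coeffs)
    y = imdct(coeffs)
    odd_err = max(abs(y[m - 1 - n] + y[n]) for n in range(m))
    even_err = max(abs(y[2 * m - 1 - n] - y[m + n]) for n in range(m))
    return odd_err, even_err
```

Since half of the 2M outputs are determined by the other half, only M values need to be computed, which is the kind of redundancy a half-length transform can exploit.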
ISBN: 9781509033614 (print)
The synthesis and mapping of user designs to configurable hardware is typically performed by heuristics. These approaches analyze the decomposability of the combinational user functions as a starting point and derive appropriate mappings to LUT structures or pull them in from pre-computed implementation libraries. In everyday use, this generally achieves a very competitive trade-off between the time spent on synthesis and the quality of the produced implementations. There is greater pressure on the optimality of an implementation when implementation libraries are generated or when critical kernels that are extensively duplicated in a massively parallel design are implemented. In these cases, a formal statement that an implementation with fewer LUTs or a smaller combinational depth is strictly impossible is very valuable. We present a tool that formulates the task of mapping a user design to a configurable hardware structure as a quantified Boolean formula (QBF). It then uses a QBF solver either to compute an implementing configuration or to prove that the desired functionality cannot be implemented within the provided hardware. In the context of this tool, we also describe different approaches to modelling configurable interconnects and present the impact of these modelling approaches on the time needed to solve the mapping tasks.
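The underlying question has the form "does a configuration exist such that, for all inputs, the hardware matches the user function?". For a tiny structure this can be answered by exhaustive search instead of a QBF solver, as the sketch below shows for a fixed chain of two 2-input LUTs, LUT2(LUT1(a, b), c). The structure, the fixed wiring and all names are assumptions made for illustration; the tool described above is not reproduced.

```python
from itertools import product


def lut(config, inputs):
    """Evaluate a LUT: config holds 2**k output bits, inputs holds k input bits."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return config[index]


def find_two_lut_mapping(target):
    """Search for configs so that LUT2(LUT1(a, b), c) == target(a, b, c) for all inputs.

    Returns (cfg1, cfg2) if a configuration exists, otherwise None, which
    proves that no configuration of this fixed structure implements target.
    """
    for cfg1 in product((0, 1), repeat=4):        # exists: LUT1 configuration
        for cfg2 in product((0, 1), repeat=4):    # exists: LUT2 configuration
            if all(                               # for all: input assignments
                lut(cfg2, (lut(cfg1, (a, b)), c)) == target(a, b, c)
                for a, b, c in product((0, 1), repeat=3)
            ):
                return cfg1, cfg2
    return None


def xor3(a, b, c):
    return a ^ b ^ c


def maj3(a, b, c):
    return (a & b) | (b & c) | (a & c)
```

With this wiring, `find_two_lut_mapping(xor3)` finds a configuration, while `find_two_lut_mapping(maj3)` returns `None`: the majority function needs three distinct behaviours of the (a, b) pair, which a single intermediate wire cannot carry. Exhaustive search grows exponentially with the number of configuration bits, which is where a dedicated solver becomes necessary.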