ISBN: 0780377958 (print)
This paper presents the implementation of a wireless multimedia DSP chip for mobile applications. The implemented DSP chip supports communication instructions for Viterbi decoding, timing synchronization, etc., as well as multimedia instructions. The DSP can handle variable-length data and perform four MACs in a cycle. The proposed DSP employs parallel processing techniques, such as SIMD, vector processing, and DSP schemes, and adopts low-power features for wireless applications. The implemented DSP chip includes test circuits and various peripherals, such as DMA, bus arbitration, and a timer. The chip was modeled in Verilog HDL and implemented using the 0.35 µm HCB60 library. The total gate count excluding memory is about 170,000 gates and the clock frequency is 100 MHz.
In this paper, the concept of the Algebraic Mapping Network (AlMa-Net) is introduced, which allows, for the first time, the inclusion of information about 'when and where' within mathematical equations. This explicit mathematical description has the same structure as 'implicit' mathematics and is used to map algebraic expressions into a Time-Space map. The main merit of AlMa-Net is that it is a unified mathematical framework for both software and hardware.
ISBN: 9781424429233 (print)
In this paper we present the VLSI implementation of a high-throughput enhanced Max-log-MAP processor that supports both single-binary (SB) and double-binary (DB) convolutional turbo codes. Combined hybrid-window (HW) and parallel-window (PW) MAP decoding is introduced to support arbitrary frame sizes with high throughput. A 1.28 mm² dual-mode (SB/DB) 2PW-1HW MAP processor is also implemented in a TSMC 0.13 µm CMOS process to verify the proposed approaches. The proposed MAP processor can be used as a hardware accelerator in multistandard platforms for wireless WAN with low cost and low energy.
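For readers unfamiliar with the term, the Max-log-MAP simplification mentioned in the abstract above can be illustrated with a minimal sketch. It compares the exact Jacobian logarithm used by log-MAP decoding with the max-only approximation. The function names `max_star` and `max_log` are illustrative assumptions, and the sketch does not model the paper's "enhanced" variant or its windowing scheme.

```python
import math

def max_star(a, b):
    """Exact Jacobian logarithm ln(e^a + e^b), as used in log-MAP decoding."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_log(a, b):
    """Max-log approximation: keep the maximum, drop the correction term."""
    return max(a, b)

if __name__ == "__main__":
    for a, b in [(1.0, 2.0), (0.0, 0.0), (-3.0, 4.0)]:
        exact = max_star(a, b)
        approx = max_log(a, b)
        # The dropped correction term is never larger than ln(2).
        print(f"a={a:5.1f} b={b:5.1f} exact={exact:.4f} approx={approx:.4f}")
```

The approximation replaces a logarithm and an exponential with a single comparison, which is why it is attractive in hardware; the price is an error of at most ln 2 per operation.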
ISBN: 0780338065 (print)
In order to cope with the increasing number of functions that need to be implemented on a single chip as telecommunication products become more complex, there is a rapid trend towards programmable architectures as a base for digital signal processing (DSP) systems. The reason for this is that extremely complex algorithms and protocols must be implemented to economically use the available bandwidth for the next generation of wireless networks. Rapidly changing system requirements, design productivity, and intellectual property reuse are also promoting this trend.
ISBN: 9781424403820 (print)
A novel on-line Mixed-Scaling-Rotation CORDIC (MSR-CORDIC) VLSI architecture is proposed. This architecture not only maintains the scaling-free property of the original MSR-CORDIC, but also achieves the target of on-line angle computation. Compared with other existing CORDIC solutions, the proposed architecture is faster and more cost-efficient, especially for QRD-RLS filtering systems. Moreover, this on-line MSR-CORDIC can also be adopted by other rotation-based DSP applications.
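As background for the "scaling-free" property named in the abstract above, the following is a minimal sketch of conventional rotation-mode CORDIC, not the MSR-CORDIC architecture itself. It shows the gain factor that a conventional design must compensate for. The function name `cordic_rotate`, the floating-point arithmetic, and the iteration count are assumptions made for illustration; a hardware design would use fixed-point shifts and adds.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians with conventional CORDIC.

    Converges for |angle| up to roughly 1.74 rad. Each micro-rotation
    uses only a shift (2**-i) and an add, but stretches the vector, so
    the accumulated gain is divided out at the end.
    """
    # Accumulated gain of all micro-rotations (about 1.6468 for many iterations).
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate towards zero residual
        shift = 2.0 ** -i
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)
    return x / gain, y / gain

if __name__ == "__main__":
    cx, cy = cordic_rotate(1.0, 0.0, math.pi / 6)
    print(cx, cy)  # close to cos(30 deg), sin(30 deg)
```

The final division by `gain` is the scaling step; an architecture described as scaling-free avoids the need for this compensation.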
This paper reviews the architectural enhancements to the second generation of a VLIW media processor. The concept of a media processor is introduced and its application in an x86 family personal computer platform is described. The architectural choices made in the original Mpact media processor are explained, along with how they were extended in the latest version based on design experience and changing requirements.
ISBN: 9781424403820 (print)
Fast and efficient operation is a major challenge for complex image processing algorithms executed in hardware. This paper describes novel algorithms for correcting optical geometric distortion in imaging systems, together with the architectures used to implement them in FPGA-based hardware. The proposed architecture produces a fast, almost real-time solution for the correction of image distortion, implemented in VHDL on a single Xilinx FPGA XCS3 10004 device. Using dedicated SRLC16 shift registers to build the synchronous FIFOs is an ideal utilization of the device resources available. The experimental results show that the barrel distortion can be quickly corrected with a very low residual error. The design can also be applied to other image processing algorithms in optical systems.
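The abstract above does not state which distortion model the authors use. As a hedged illustration only, the sketch below applies a common single-coefficient radial model, in which a point at distance r from the optical centre is moved to r(1 + k1 r²). The function name `undistort_point`, the centre `(cx, cy)`, and the coefficient `k1` are all assumptions, not details from the paper.

```python
def undistort_point(xd, yd, cx, cy, k1):
    """Map a distorted pixel (xd, yd) to its corrected position.

    Assumed model: single-coefficient radial correction about the
    centre (cx, cy). A positive k1 pushes points outward, which
    counteracts barrel distortion under this model.
    """
    dx, dy = xd - cx, yd - cy
    r2 = dx * dx + dy * dy       # squared distance from the centre
    scale = 1.0 + k1 * r2        # radial stretch factor
    return cx + dx * scale, cy + dy * scale

if __name__ == "__main__":
    # The centre is a fixed point; pixels further out move more.
    print(undistort_point(100, 100, 100, 100, 1e-6))
    print(undistort_point(200, 100, 100, 100, 1e-6))
```

In a streaming hardware implementation, a mapping of this kind is typically evaluated per output pixel while a few image lines are buffered, which is one reason FIFO resources matter in such designs.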
ISBN: 9781538663189 (print)
In this paper, we present an ARM-based decoder for Low Density Parity Check (LDPC) codes. To maximize the efficiency of parallel execution and fully utilize the ARM processors' capacity, instruction-level parallelism optimizations are applied. Compressed storage and efficient data prefetching are used to improve cache efficiency. The experiments are carried out on the NVIDIA Jetson K1 platform. Our decoder achieves a 1.85 times speedup compared with the related work, and it has better energy efficiency than decoders on x86 CPUs and GPUs.
In this paper we outline the main design features of a low complexity speech recognition engine targeted for mobile devices. Although major parts have already been presented, new features and important refinements of the original ideas, which were omitted, are now described. We also show how these techniques can be successfully combined in order to achieve various design targets with minimized impact on the recognition performance.
ISBN: 0780371453 (print)
Architecture enhancements to the C6000 architecture have improved performance, reduced code size, lowered power, and increased compiler efficiency. In this work, benchmarks of DSP kernels and typical DSP applications are used to compare commercially available DSPs in terms of cycle count, power, and compiler efficiency.