ISBN: 0780393333 (print)
This paper addresses a new class of security vulnerabilities introduced by the use of Network-on-Chip (NoC) in System-on-Chip (SoC) design. The study draws on experience with a CAD framework for NoC design and proposes a classification of weaknesses with regard to common routing and interface techniques. Finally, design strategies are proposed and a new path routing technique (SCP) is introduced with the aim of enforcing security.
ISBN: 0780393333 (print)
We studied the efficient implementation of a motion estimation (ME) algorithm for H.264/AVC on the TMS320C64x, a VLIW (Very Long Instruction Word) SIMD (Single Instruction Multiple Data) digital signal processor. H.264 motion estimation algorithms demand many arithmetic operations, especially because of the variable block size optimization. The SAD (Sum of Absolute Differences) reuse method is chosen not only to reduce the computation but also to exploit the regular algorithmic structure, which is essential for efficient implementation on parallel and pipelined processors. We applied several techniques: loop length increase for efficient software pipelining, multiblock SAD computation to reduce memory access overhead, block processing to minimize cache misses, and improved quarter-pixel processing. The implementation results show that real-time ME for D1-size (720×480) video is possible using a 720 MHz TMS320C6416 digital signal processor.
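The SAD reuse idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common formulation in which the sixteen 4x4 SADs of a 16x16 macroblock are computed once per candidate motion vector and then summed to obtain the SADs of the larger H.264 partitions. The function names and the plain-Python data layout are hypothetical and are not taken from the paper, which targets hand-optimized DSP code.

```python
def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid of SADs, one per 4x4 sub-block of a 16x16 macroblock.

    cur and ref are 16x16 lists of pixel values (current block and the
    reference block at one candidate motion vector).
    """
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def combine(grid, rows, cols):
    """Sum 4x4 SADs into SADs of blocks spanning rows x cols grid cells."""
    return [
        [
            sum(grid[r + i][c + j] for i in range(rows) for j in range(cols))
            for c in range(0, 4, cols)
        ]
        for r in range(0, 4, rows)
    ]


def variable_block_sads(cur, ref):
    """SADs for all H.264 partition sizes, reusing the 4x4 SADs.

    Keys are width x height in pixels; values are grids of SADs laid out
    in the same row/column order as the partitions in the macroblock.
    """
    g = sad_4x4_grid(cur, ref)  # pixel differences are computed only here
    return {
        "4x4": g,
        "8x4": combine(g, 1, 2),
        "4x8": combine(g, 2, 1),
        "8x8": combine(g, 2, 2),
        "16x8": combine(g, 2, 4),
        "8x16": combine(g, 4, 2),
        "16x16": combine(g, 4, 4),
    }
```

In this sketch the pixel-level absolute differences are evaluated once, and every larger partition costs only a handful of additions, which is the source of the regular structure the abstract refers to.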
This paper proposes an asynchronous multi-core architecture for embedded systems that run image processing algorithms based on partial differential equations. A data-flow study and a timing analysis are carried out to determine the optimal global architecture specifications. The global architecture uses a semi-parallel approach with several processing units running in parallel and shared memory blocks. The results are illustrated by the implementation of a continuous watershed transform, followed by a discussion of the measured execution time and the computational load to demonstrate the efficiency.
ISBN: 9781479965885 (print)
The Field Programmable Gate Array (FPGA) implementation of the commonly used Histogram of Oriented Gradients (HOG) algorithm is explored. The HOG algorithm is employed to extract features for object detection. A key focus has been to explore the use of a new FPGA-based processor, IPPro, which is targeted at image processing. The paper details the mapping and scheduling factors that influence performance and the stages undertaken to deploy the algorithm on FPGA hardware, taking into account the specific IPPro architecture features. We show that multi-core IPPro performance can exceed that of state-of-the-art FPGA designs by up to 3.2 times, with reduced design and implementation effort and increased flexibility, all on a low-cost Zynq programmable system.
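For readers unfamiliar with HOG, the core feature-extraction step can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's IPPro mapping: it uses central-difference gradients, unsigned orientations over 0 to 180 degrees, and nearest-bin voting, whereas full HOG pipelines typically also interpolate votes between bins and normalize histograms over blocks of cells. The function name `cell_histogram` is hypothetical.

```python
import math


def cell_histogram(cell, bins=9):
    """Orientation histogram for one cell of grayscale pixel values.

    Each interior pixel votes its gradient magnitude into the bin that
    contains its unsigned gradient orientation (0 to 180 degrees).
    Border pixels are skipped so that central differences stay in range.
    """
    height, width = len(cell), len(cell[0])
    bin_width = 180.0 / bins
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist
```

A cell whose intensity rises only from left to right puts all of its votes into the first bin, while a cell that rises only from top to bottom votes into the bin containing 90 degrees.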
The proceedings contain 60 papers from the 1998 IEEE Workshop on Signal Processing Systems (SiPS 98) - Design and Implementation. Topics discussed include: cache systems in video signal processing; real-time software video encoders; system-on-a-chip design of low-power smart vision systems; hierarchical watermarking techniques; hand-held multimedia terminals; high-resolution digital image acquisition using wavelet image compression; enhanced template matching algorithms; filter banks; discrete time-frequency distribution; multichannel reverberation for computer music applications; scheduling strategies for low-energy programmable digital-serial Reed-Solomon codecs; and digital demodulator matching.
One of the key technologies for spoken language processing is the automatic synthesis of speech. For a significant number of current and future applications (including various telecommunication services and voice interfaces for mobile devices), synthesizing good-quality speech from unrestricted text and implementing the corresponding synthesis systems efficiently remain very difficult tasks. This paper presents an optimized implementation of a text-to-speech synthesis system for the Romanian language using a Motorola development platform built around a StarCore SC140-based processor. The paper emphasizes the key requirements for such an embedded implementation (especially the trade-off between intelligibility and footprint), the problems that were encountered, and the solutions found for them.
ISBN: 9781424403820 (print)
Multiband orthogonal frequency-division multiplexing (MB-OFDM) systems employ frequency hopping to achieve multiple access and frequency diversity. However, frequency hopping also complicates the packet detector (PD), which then requires high hardware complexity. In this paper, we propose several low-cost design schemes for the PD, such as Walsh-Hadamard decomposition, buffered summation, and sign-bit-remaining methods. The estimated gate count of the resulting PD implementation is less than half that of existing solutions.
This paper reviews the architectural enhancements to the second generation of a VLIW media processor. The concept of a media processor is introduced and its application in an x86-family personal computer platform is described. The architectural choices made in the original Mpact media processor are explained, as is how they were extended in the latest version based on design experience and changing requirements.
ISBN: 9781424403820 (print)
Low-Density Parity-Check (LDPC) codes have been adopted in the physical layer of many communication systems because of their superior performance. The direct implementation of these codes on an existing software defined radio (SDR) platform is likely to be inefficient. Our approach is to design the LDPC code to match the constraints imposed by the existing architecture, without compromising the communication performance. We present a procedure for architecture-aware code design that involves feature identification, code construction, and verification. Details of the procedure are presented for the case in which the SDR platform is equipped with a multi-stage interconnection network (MIN). By analyzing the characteristics of the MIN, simple yet explicit constraints are derived and used in the code construction step. The resulting LDPC code not only maps very efficiently onto the SDR platform but also has very good bit error rate (BER) performance.
ISBN: 0780393333 (print)
The Koetter-Vardy algorithm is an algebraic soft-decision decoding algorithm for Reed-Solomon codes. Software implementations of the Koetter-Vardy algorithm are considered as part of a redecoding architecture that augments a hardware hard-decision decoder with soft-decision decoding software on an embedded processor. In this paper we investigate the implementation of the interpolation step of the Koetter-Vardy algorithm on SIMD processor architectures. A parallelization of the algorithm is given using the Kth-order Horner's rule for parallel polynomial evaluation. The SIMD algorithm runs 2.5 to 4 times faster than a serial implementation on a DSP. To gain further speedup, we propose a merged-SIMD architecture that calculates the Hasse derivative in parallel with the polynomial updates.
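The Kth-order Horner's rule named in this abstract can be sketched as follows. The idea is to split a polynomial into K interleaved sub-polynomials in x^K, evaluate each with an independent Horner recurrence (one per SIMD lane), and then combine the K partial results. This is an illustrative sketch only: it uses ordinary integer arithmetic, whereas a Reed-Solomon decoder would perform the same steps over a finite field, and the function name `horner_kth_order` is hypothetical.

```python
def horner_kth_order(coeffs, x, k):
    """Evaluate sum(coeffs[i] * x**i) using Kth-order Horner's rule.

    The polynomial is written as p(x) = sum_j x**j * q_j(x**k), where q_j
    takes coefficients j, j + k, j + 2k, ... The k inner Horner loops are
    independent of each other, so they could run in parallel lanes.
    """
    xk = x ** k
    lanes = []
    for j in range(k):
        acc = 0
        for a in reversed(coeffs[j::k]):  # Horner's rule in the variable x**k
            acc = acc * xk + a
        lanes.append(acc)
    # Combine the lane results: lanes[0] + lanes[1]*x + ... + lanes[k-1]*x**(k-1)
    result = 0
    for lane in reversed(lanes):
        result = result * x + lane
    return result
```

With k = 1 this reduces to the ordinary serial Horner's rule. Larger k shortens each dependent multiply-add chain by roughly a factor of k, at the cost of the final combination step.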