ISBN: 9781509033614 (print)
Motion estimation accounts for most of the time and power consumed in both the HEVC and H.264 video compression standards. This paper presents a Fast Motion Estimation algorithm that targets Full Search quality even at HD resolution. It enhances existing Fast Motion Estimation algorithms, with the main purpose of reducing the cost and power consumption of handheld devices performing Motion Estimation while preserving picture quality. The proposed algorithm is based on dimensionality reduction and uses locality-sensitive hash functions to achieve a "Quantitative Expression of Similarity". The algorithm extends other existing algorithms in the literature, in particular one of the most efficient, HMDS.
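For context, the Full Search baseline that the abstract uses as its quality target can be sketched as below. This is a minimal illustration, not the paper's hash-based algorithm: the sum of absolute differences (SAD) cost, the function names, and the frame layout (lists of rows of integer pixel values) are all assumptions made for the example.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def full_search(cur, ref, x, y, size, search_range):
    """Test every displacement within +/- search_range and keep the cheapest.

    Returns ((mx, my), cost) for the block of `cur` whose top-left corner
    is (x, y). Candidates that fall outside `ref` are skipped.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            rx, ry = x + mx, y + my
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, x, y, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mx, my), cost
    return best_mv, best_cost
```

Each block costs on the order of (2 * search_range + 1)^2 SAD evaluations, which is the workload that fast algorithms aim to cut down.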
ISBN: 0780377958 (print)
An energy-aware DCT (Discrete Cosine Transform) architecture based on the distributed arithmetic concept is proposed. Architectures based on distributed arithmetic are inherently low power because they are multiplication-free. One characteristic of the DCT is that, after transformation, signal energy is concentrated in only a few coefficients (less than 25%), while the remaining 75% are insignificant and negligible. The computation of these terms can be skipped without seriously affecting the output signal quality. Exploiting this idea, we propose an energy-aware DCT architecture that adaptively trades off image quality against power dissipation. Our simulation results show that the proposed architecture achieves 60% power savings with a small degradation in signal quality.
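The energy-compaction property described above can be illustrated with a short sketch. It uses a direct floating-point orthonormal 1-D DCT-II rather than the distributed arithmetic architecture of the paper, and the function names and sample data are assumptions chosen for the example.

```python
import math


def _scale(k, n):
    """Orthonormal DCT scaling factor for coefficient index k."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(x):
    """Orthonormal 1-D DCT-II of a list of samples."""
    n = len(x)
    return [
        _scale(k, n)
        * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(_scale(k, n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n))
        for i in range(n)
    ]


def truncate(c, keep):
    """Zero every coefficient except the first `keep` (lowest frequencies)."""
    return [v if k < keep else 0.0 for k, v in enumerate(c)]


def energy_fraction(c, keep):
    """Share of total signal energy held by the first `keep` coefficients."""
    total = sum(v * v for v in c)
    return sum(v * v for v in c[:keep]) / total if total else 1.0
```

For a smooth input such as `[10, 11, 12, 13, 14, 15, 16, 17]`, keeping 2 of the 8 coefficients (25%) retains nearly all of the energy, so `idct(truncate(dct(x), 2))` stays close to `x`. Skipping the discarded terms is where an adaptive architecture can save work.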
ISBN: 9781509033614 (print)
Full-Dimension MIMO (FD-MIMO) technology has been shown to increase spectral efficiency 2-4X compared to current LTE systems by exploiting a large number of antennas to support high-order multiuser MIMO. High-order multiuser MIMO with a large number of antennas increases design and implementation complexity significantly. Furthermore, practical challenges such as antenna calibration for RF mismatches and failure need to be considered. In this paper, a reduced-complexity precoding algorithm is introduced and optimized for real-time FPGA-based implementation. A novel antenna calibration architecture is designed for large 2-dimensional FD-MIMO arrays, taking practical RF failures into account. Field experimental results are also presented, based on a proof-of-concept (PoC) FD-MIMO base station (BS) with 32 antennas that supports up to 12 users and achieves a spectral efficiency of approximately 21 bits/sec/Hz.
ISBN: 9781424403820 (print)
Bandwidth to off-chip memory is a scarce resource in complex Systems-on-Chip for embedded media processing. We apply embedded compression to bandwidth-hungry image processing functions in order to alleviate this bottleneck. In our solution, embedded compression is implemented as part of the System-on-Chip infrastructure and is fully transparent to the hardware and software image processing components. Hence it can be applied without requiring changes to these components. We present the compression algorithm and demonstrate that we achieve significant bandwidth reductions (20% - 40%) for image data at acceptable cost (approximately 1 mm² in 90 nm CMOS) while preserving high image quality.
ISBN: 0780371453 (print)
In state-of-the-art multimedia compression standards, arithmetic coding is widely used as a powerful entropy compression method. In the MPEG-4 standard, a specific 4-symbol, multiple-context arithmetic coder is used for wavelet-based image compression. In this paper we present an architecture capable of processing close to 1 symbol per cycle, managing multiple contexts in a simple yet cost-efficient manner. A peak performance of 200 Mbit/s is achieved when clocking this architecture at 100 MHz.
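One reading that is consistent with these figures, offered as an assumption rather than the authors' stated derivation, is that each symbol of a 4-symbol alphabet carries 2 bits, so 1 symbol per cycle at 100 MHz gives 200 Mbit/s. The helper below is hypothetical and only restates that arithmetic.

```python
import math


def peak_throughput_mbit_s(clock_mhz, symbols_per_cycle, alphabet_size):
    """Peak rate in Mbit/s, assuming log2(alphabet_size) bits per symbol."""
    bits_per_symbol = math.log2(alphabet_size)
    return clock_mhz * symbols_per_cycle * bits_per_symbol
```

With `peak_throughput_mbit_s(100, 1, 4)` the result is 200.0, matching the quoted peak. Since the architecture processes close to, rather than exactly, 1 symbol per cycle, the sustained rate would be somewhat lower.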
ISBN: 0780385047 (print)
In this paper, extended instructions for the advanced encryption standard (AES) cryptography acceleration in embedded processors and efficient implementation of these instructions are presented. These AES instructions generate four elements in single-instruction, multiple-data format from each input of an AES state. The instruction count for 128-bit key AES encryption can be reduced from 688 to 340 per 128-bit block by using the proposed AES instructions. The execution unit for the AES instructions can be implemented efficiently with a single 2-Kbit table and four small multipliers. The capacity of the table has been reduced to 1/32, compared to that of a conventional fast software algorithm. The AES instructions enable embedded processors for low-cost network equipment to have cryptographic capability with minimal modification.
ISBN: 0780377958 (print)
Information security is currently an important issue in our information society and technology. In this paper, we propose two efficient architectures for a processor implementing the 128-bit block cipher SEED using a 32-bit data bus. We compare the proposed architectures with the conventional SEED processor. The proposed SEED processors improve speed and reduce hardware resources by using only one G-function in the F-function and the key scheduler of SEED. The operation of the proposed methods has been verified through functional simulation, synthesis, and on-board testing. The proposed architecture is suited to hardware-critical applications such as smart cards, PDAs, and mobile phones.
ISBN: 0780375874 (print)
Video segmentation is a key unit in content-based video encoding systems such as MPEG-4. Existing algorithms are too complex for real-time applications, and hardware implementation is infeasible because of their global and irregular operations. In this paper, a hardware system for video segmentation is proposed, from the algorithm level to the hardware architecture level. A hardware-oriented algorithm is first proposed to generate accurate object masks using local pixel operations and morphological operations, which are suitable for hardware implementation. The hardware architecture is then designed around a partial-result-reuse architecture and a programmable morphology PE array architecture, which together achieve both high flexibility and high throughput. A prototype chip is implemented that achieves a processing speed of 30 QCIF frames per second and 7,680 morphological operations per second at 26 MHz. The results also show that the hardware cost is small and that the proposed video segmentation hardware system is suitable for integration into any content-based video encoding system.
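The morphological operations mentioned above are local, neighbourhood-only computations, which is what makes them a good fit for a PE array. A minimal sketch of binary dilation and erosion follows. The 3x3 square structuring element, the function names, and the border handling (windows are clipped to the image) are assumptions for illustration, not details taken from the paper.

```python
def _window(mask, y, x):
    """Values in the 3x3 neighbourhood of (y, x), clipped to the image."""
    h, w = len(mask), len(mask[0])
    return [
        mask[ny][nx]
        for ny in range(max(0, y - 1), min(h, y + 2))
        for nx in range(max(0, x - 1), min(w, x + 2))
    ]


def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighbourhood is set."""
    return [
        [int(any(_window(mask, y, x))) for x in range(len(mask[0]))]
        for y in range(len(mask))
    ]


def erode(mask):
    """Keep a pixel only if every pixel in its 3x3 neighbourhood is set."""
    return [
        [int(all(_window(mask, y, x))) for x in range(len(mask[0]))]
        for y in range(len(mask))
    ]
```

Because each output pixel depends only on a small fixed window, neighbouring windows share most of their inputs, which is the kind of overlap that a partial-result-reuse design can exploit.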
ISBN: 9781509033614 (print)
One of the first processing steps in a DVB-S2 signal receiver is the detection of the frame header. Recently, an architecture using only the phase information of the received samples was proposed. In this paper, several algorithm and architecture optimizations are proposed, leading to better performance and reduced hardware complexity. For an SNR of -3 dB, the missed-detection probability of the header detector is reduced from 0.7 to 0.52 for a constant false-alarm probability of 10^-6.
ISBN: 0780338065 (print)
This paper describes a new architecture for content-based, interactive multimedia applications. A hardware implementation of a Java Virtual Machine (JVM) is proposed, which allows for direct execution of Java bytecode. In a single clock cycle, up to 3 bytecode instructions can be decoded and executed in parallel using a RISC pipeline. A splittable 64-bit ALU implementation addresses the demanding processing requirements of typical multimedia signal processing schemes. The proposed architecture supports parallel execution of multiple Java threads. An implementation of the basic building blocks of the processor with a standard-cell library provides an estimated clock speed of 150 MHz for a 0.35 µm, 3-metal-layer CMOS process. With less than 10 mm² needed for the core logic, it is possible to integrate multiple JVMs together with larger cache memories on a single chip.