This paper presents a high-performance design for the Context-Based Adaptive Variable-Length Coding (CAVLC) used in the H.264/AVC standard. To reduce the number of cycles needed to process one macroblock (MB), a two-stage residual encoder is proposed so that the scan and encode stages work simultaneously. The scan engine scans two coefficients per cycle. A parallel encoder for two levels and a parallel encoder for two runs are adopted to accelerate the encoder engine. At most 228 cycles are needed to process one MB. Because of the skip-block mode determined by the coded block pattern (CBP), our experiments show that only 160 cycles are needed on average. The proposed CAVLC encoder supports real-time encoding of 4Kx2K video at 30 fps (frames per second) at 250 MHz, and the gate count is only about 16k.
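As a rough sanity check on the throughput claim above, the cycle budget per macroblock can be worked out directly. The sketch below assumes that "4Kx2K" means 4096×2160 pixels and that a macroblock covers 16×16 pixels; the exact frame size is an assumption, since the abstract does not state it, and the function names are illustrative only.

```python
# Cycle-budget check for the CAVLC throughput figures quoted in the abstract.
# ASSUMPTION: "4Kx2K" is taken as 4096x2160 pixels (not stated in the abstract).
WIDTH, HEIGHT = 4096, 2160
MB_SIZE = 16                 # H.264 macroblock is 16x16 luma samples
FPS = 30
CLOCK_HZ = 250_000_000       # 250 MHz, from the abstract
WORST_CASE_CYCLES = 228      # from the abstract
AVERAGE_CYCLES = 160         # from the abstract


def macroblocks_per_second(width, height, fps, mb_size=MB_SIZE):
    """Number of macroblocks that must be encoded each second."""
    mbs_wide = -(-width // mb_size)    # ceiling division
    mbs_high = -(-height // mb_size)
    return mbs_wide * mbs_high * fps


def required_clock_hz(cycles_per_mb, width=WIDTH, height=HEIGHT, fps=FPS):
    """Clock rate needed if every macroblock takes cycles_per_mb cycles."""
    return cycles_per_mb * macroblocks_per_second(width, height, fps)


if __name__ == "__main__":
    rate = macroblocks_per_second(WIDTH, HEIGHT, FPS)
    print("macroblocks per second:", rate)
    print("cycle budget per MB at 250 MHz:", CLOCK_HZ / rate)
    print("clock needed, worst case:", required_clock_hz(WORST_CASE_CYCLES))
    print("clock needed, average:", required_clock_hz(AVERAGE_CYCLES))
```

Under these assumptions, even the worst case of 228 cycles per macroblock fits within the 250 MHz clock, which is consistent with the real-time claim.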
In this paper, a genetic algorithm (GA) using a parallel tabular technique is presented for the optimization of mixed-polarity Reed-Muller and mixed-polarity dual Reed-Muller functions. The algorithm is to find optimal so...
This paper presents a security processor based on the MIPS 4KE architecture that adds AES and ECC security functions. Because AES and ECC encryption have different characteristics, two dedicated hardware units are employed. One is the AES function unit, which is integrated into the pipeline of this MIPS-like processor; the other is the ECC unit, which works as a coprocessor to implement asymmetric cryptographic algorithms. Moreover, instruction set extensions (ISE) of MIPS for these security functions are developed. As a result, our security processor not only handles high-intensity encryption tasks but is also compatible with the industry's leading software development tools. Finally, its functionality and high performance are verified on our experimental chip.
In this paper, a hardware/software co-design approach is proposed for parsing video bitstreams that conform to various video compression standards. The layered structure of the syntax elements in video bitstreams is analyzed, and a hardware/software partition is proposed accordingly. Because of the high data rate, syntax elements in the slice data and lower layers are commonly parsed by hardware. For syntax elements in the slice header and upper layers, we propose a hardware/software co-design approach that combines the advantages of hardware acceleration and software flexibility: dedicated hardware accelerators are designed to parse these codes, but the parsing process is controlled by software instead of a hardware finite state machine (FSM). This approach speeds up Variable-Length Decoding (VLD) while retaining the flexibility to support multiple video coding standards.
There is a growing tendency for FPGA (Field Programmable Gate Array) IP (Intellectual Property) cores to be embedded in an SOC (System On a Chip). Embedded FPGA cores can improve the flexibility of the SOC. However, different SOCs vary in the FPGA tile array size they require, so a scalable FPGA generator is needed. In this paper, an automatic layout generator that supports user-defined FPGA array sizes is introduced and compared with previous related work. The paper shows that the proposed layout generator, which is based on FPGA tiles, is more practical than previous tools.
Through-Silicon Via (TSV) is a technology that enables vertical integration of silicon dies forming a single 3D-IC stack. In this paper, a practical model is proposed for the TSV assignment problem of the stacked-die ...
The adaptive support-weight algorithm can generate high-quality disparity maps for stereo matching. Because of its complexity, however, it requires a large internal memory and high bandwidth to meet the real-time constraint. In this paper, we first analyze the requirements of this algorithm from the hardware perspective. We then propose the Support-Weight Window Reuse (SWWR) technique, which can shorten computation time by a factor equal to the number of disparities, and Left-Right Cost Reuse (LRCR), which reduces bandwidth by more than half. The comparison shows that our proposed flow generates much better disparity results and meets the real-time constraint with relatively low memory cost and bandwidth.
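To make the cost of adaptive support-weight aggregation more concrete, here is a minimal sketch of the commonly used formulation, in which each pixel in a window is weighted by its intensity similarity and spatial proximity to the window center. The abstract does not give the exact formulation or parameters, so the grayscale simplification, the function names, and the `gamma_c` and `gamma_p` values are all assumptions for illustration, not the authors' hardware design.

```python
import math


def support_weights(window, gamma_c=7.0, gamma_p=36.0):
    """Weights for a square, odd-sized window of grayscale intensities.

    Each weight is exp(-(delta_c / gamma_c + delta_g / gamma_p)), where
    delta_c is the intensity difference to the center pixel and delta_g
    is the Euclidean distance to it. gamma values are illustrative.
    """
    c = len(window) // 2
    center = window[c][c]
    weights = []
    for y, row in enumerate(window):
        out = []
        for x, value in enumerate(row):
            delta_c = abs(value - center)
            delta_g = math.hypot(x - c, y - c)
            out.append(math.exp(-(delta_c / gamma_c + delta_g / gamma_p)))
        weights.append(out)
    return weights


def aggregate_cost(ref_weights, tgt_weights, raw_costs):
    """Weighted average of per-pixel matching costs for one disparity."""
    num = 0.0
    den = 0.0
    for wr_row, wt_row, cost_row in zip(ref_weights, tgt_weights, raw_costs):
        for wr, wt, cost in zip(wr_row, wt_row, cost_row):
            num += wr * wt * cost
            den += wr * wt
    return num / den


if __name__ == "__main__":
    ref = [[10, 10, 10], [10, 10, 10], [10, 10, 200]]
    w = support_weights(ref)
    # The outlier pixel (200) receives a much smaller weight than its neighbors.
    print(w[0][0], w[2][2])
    print(aggregate_cost(w, w, [[5.0] * 3 for _ in range(3)]))
```

In this formulation the reference-window weights depend only on the image, not on the candidate disparity, while the aggregation itself is repeated per disparity; that repetition is what makes the algorithm demanding in memory and bandwidth.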
We introduce 10×10 Gb/s RZ-DPSK-DWDM links in the architecture integration of MIMO-OFDM *** optical communication performance and wireless communication performance are analyzed in the integrated *** long distance transmission for RZ-DPSK WDM is done, and the demodulated RZ-DPSK signal is transformed to binary data, which is given to the MIMO-OFDM-QPSK wireless communication *** MIMO-OFDM-QPSK wireless system is designed with two transmit antennas and one receive antenna and with four transmit antennas and one receive *** BER performance of almost 10⁻³ is achieved at an SNR of 11 dB with four transmit antennas and one receive antenna, which is better than with two transmit antennas and one receive *** results prove that seamless integration of RZ-DPSK-DWDM optical links with a MIMO-OFDM system is suitable to simultaneously offer higher receiver performance and simplify system configuration for 4th-generation wide-area coverage mobile communication.
A hardware/software co-processing system for speech recognition applications is proposed in this paper. The system consists of a soft-core microprocessor and a dedicated hardware accelerator implemented on an FPGA, and it is intended for use in embedded devices. By offloading the computation-intensive parts of the speech recognition system to the hardware accelerator, both faster recognition and lower power consumption are achieved without degrading recognition accuracy. The design is described in Verilog HDL and synthesized on a Xilinx Virtex-5 FPGA. Tests show that the proposed system runs 2.18 times faster than a pure software system.
A novel bandgap reference (BGR) with an ultra-low supply voltage is presented. The proposed bandgap reference uses subthreshold MOSFETs to provide temperature compensation. The proposed bandgap is analyzed and compared with the conventional current-mode bandgap, and it is shown that, when working with a low supply voltage, the proposed bandgap is less sensitive to mismatch and power supply noise. The bandgap reference is implemented in SMIC 0.13 μm RF technology, and simulation results show that it can provide an output voltage of 429 mV with a supply voltage as low as 0.6 V.