Search Results - Inner Mongolia University Library

SPIE Conference on Advanced Signal Processing Algorithms, Architectures, and Implementations VIII

Authors: Ma, J; Parhi, KK; Hekstra, GJ; Deprettere, EF. Affiliation: Univ Minnesota, Dept ECE, Minneapolis, MN 55455, USA

ISBN: (print) 0819429163

CORDIC based IIR digital filters are orthogonal filters whose internal computations consist of orthogonal transformations. These filters possess desirable properties for VLSI implementations such as regularity, local connection, low sensitivity to finite word-length implementation, and elimination of limit cycles. Recently, fine-grain pipelined CORDIC based IIR digital filter architectures which can perform the filtering operations at arbitrarily high sample rates at the cost of a linear increase in hardware complexity have been developed. These pipelined architectures consist of only Givens rotations and a few additions which can be mapped onto CORDIC arithmetic based processors. However, in practical applications, implementations of Givens rotations using traditional CORDIC arithmetic are quite expensive. For example, for 16 bit accuracy, using floating point data format with 16 bit mantissa and 5 bit exponent, it will require approximately 20 pairs of shift-add operations for one Givens rotation. In this paper, we propose an efficient implementation of pipelined CORDIC based IIR digital filters based on fast orthonormal mu-rotations. Using this method, the Givens rotations are approximated by angles corresponding to orthonormal mu-rotations, which are based on the idea of CORDIC and can perform rotation with a minimal number of shift-add operations. We present various methods of construction for such orthonormal mu-rotations. A significant reduction (over 76%) of the number of required shift-add operations is achieved. All types of fast rotations can be implemented as a cascade of only four basic types of shift-add stages. These stages can be executed on a modified floating-point CORDIC architecture, making the pipelined filter highly suitable for VLSI implementations.

Keywords: orthogonal IIR digital filters; pipelining; approximate rotations; orthonormal mu-rotations; CORDIC architecture
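
For orientation, the conventional CORDIC rotation that the abstract takes as its baseline can be sketched as follows. This is a generic textbook illustration, not the fast mu-rotation scheme proposed in the paper; the function name `cordic_rotate` and the use of floating-point arithmetic in place of hardware shifts are assumptions made for readability.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians with classic CORDIC micro-rotations.

    Each iteration is one pair of shift-add operations in hardware;
    multiplication by 2.0 ** -i stands in for a right shift by i bits.
    Converges for |angle| up to roughly 1.74 radians.
    """
    # Elementary rotation angles atan(2^-i), normally stored in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Accumulated gain of all micro-rotations, removed once at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x / gain, y / gain
```

With 16 iterations the result agrees with the exact rotation to about four decimal places, which shows why a full-accuracy rotation needs a shift-add pair per bit of precision.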

ADAPTIVE LATTICE FILTER IMPLEMENTATIONS ON PIPELINED MULTIPROCESSOR ARCHITECTURES

IEEE TRANSACTIONS ON COMMUNICATIONS, 1990, Vol. 38, No. 1, pp. 122-124

Authors: MEYER, MD; AGRAWAL, DP. Affiliation: Computer Systems Laboratory, Electrical and Computer Engineering Department, North Carolina State University, Raleigh, NC, USA

Three pipelined multiprocessor implementations of adaptive lattice filters are examined. A performance analysis is done for each multiprocessor system, and expressions for approximate computation time and speedup are derived. Each system is shown to be flexible with respect to the different algorithms and the different filter sizes it can implement.

Keywords: Lattices; Adaptive filters; Signal processing algorithms; Computer architecture; Transversal filters; Feedback; Reflection; Hardware; Filtering; Least squares methods

Efficient Implementations for AES Encryption and Decryption

CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2012, Vol. 31, No. 5, pp. 1765-1785

Authors: Rachh, Rashmi Ramesh; Mohan, P. V. Ananda; Anami, B. S. Affiliations: ECIL R&D, Bangalore 560052, Karnataka, India; KLE Soc Coll Engn & Technol, Dept Comp Sci, Belgaum 590008, India; KLE Inst Technol, Hubli, India

This paper proposes two efficient architectures for hardware implementation of the Advanced Encryption Standard (AES) algorithm. The composite field arithmetic for implementing SubBytes (S-box) and InvSubBytes (Inverse S-box) transformations investigated by several authors is used as the basis for deriving the proposed architectures. The first architecture for encryption is based on optimized S-box followed by bit-wise implementation of MixColumns and AddRoundKey and optimized Inverse S-box followed by bit-wise implementation of InvMixColumns and AddMixRoundKey for decryption. The proposed S-box and Inverse S-box used in this architecture are designed as a cascade of three blocks. In the second proposed architecture, the block III of the proposed S-box is combined with the MixColumns and AddRoundKey transformations forming an integrated unit for encryption. An integrated unit for decryption combining the block III of the proposed InvSubBytes with InvMixColumns and AddMixRoundKey is formed on similar lines. The delays of the proposed architectures for VLSI implementation are found to be the shortest compared to the state-of-the-art implementations of AES operating in non-feedback mode. Iterative and fully unrolled sub-pipelined designs including key schedule are implemented using FPGA and ASIC. The proposed designs are efficient in terms of Kgates/Giga-bits per second ratio compared with a few recent state-of-the-art ASIC (0.18-mu m CMOS standard cell) based designs and throughput per area (TPA) for FPGA implementations.

Keywords: Advanced Encryption Standard; Encryption; Decryption; FPGA implementation; VLSI architectures
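
As a software reference point for the MixColumns and AddRoundKey steps named in the abstract, a minimal sketch of the standard AES column transform built only from XORs and the "multiply by x" primitive is given here. It follows the public AES definition rather than the paper's integrated hardware unit; the function names are illustrative assumptions.

```python
def xtime(b):
    """Multiply a byte by x (0x02) in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B  # reduce by the AES field polynomial
    return b & 0xFF

def mix_column(col):
    """Apply the AES MixColumns matrix to one 4-byte column.

    Multiplication by 0x03 is written as xtime(a) ^ a, so the whole
    transform reduces to shifts and XORs.
    """
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),
    ]

def mix_and_add_round_key(col, key_col):
    """MixColumns followed by AddRoundKey (byte-wise XOR) on one column."""
    return [m ^ k for m, k in zip(mix_column(col), key_col)]
```

Because both steps are linear over GF(2), they can be merged into one network of XOR gates, which is the kind of merging the second architecture in the abstract relies on.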

FPGA Implementations of HEVC Inverse DCT Using High-Level Synthesis

Conference on Design and Architectures for Signal & Image Processing

Authors: Kalali, Ercan; Hamzaoglu, Ilker. Affiliation: Sabanci Univ, Fac Engn & Nat Sci, TR-34956 Istanbul, Turkey

ISBN: (print) 9791092279115

High Efficiency Video Coding (HEVC), the recently developed international video compression standard, has 50% better video compression efficiency than the H.264 video compression standard at the expense of significantly increased computational complexity. The HEVC Inverse Discrete Cosine Transform (IDCT) algorithm accounts for 11% of the computational complexity of an HEVC video encoder. Recently, commercial and academic high-level synthesis (HLS) tools have started to be used successfully for FPGA implementations of digital signal processing algorithms. Therefore, in this paper, the first FPGA implementations of the HEVC 2D IDCT algorithm using HLS tools in the literature are proposed. The proposed HEVC IDCT hardware is implemented on Xilinx FPGAs using three HLS tools: Xilinx Vivado HLS, LegUp, and MATLAB Simulink HDL Coder. Using HLS tools significantly reduced the FPGA development time, and the resulting FPGA implementations achieved real-time performance. Therefore, HLS tools can be used for FPGA implementation of an HEVC video encoder.

Keywords: HEVC; IDCT; FPGA Implementation; HLS
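
To make the 2D IDCT concrete, the smallest case can be sketched in software: a 4x4 integer inverse transform done as two 1-D passes with rounding shifts. The coefficient matrix and the shift values (7 for the first pass, 20 minus the bit depth for the second) are stated here from general knowledge of the HEVC core transform and should be treated as assumptions to check against the standard; the intermediate clipping done by real codecs is omitted, and the function names are hypothetical.

```python
# 4-point HEVC core transform matrix (rows are basis vectors); assumed values.
M4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]

def idct4_1d(coeffs, shift):
    """1-D inverse transform of four coefficients with a rounding right shift."""
    add = 1 << (shift - 1)  # rounding offset
    return [
        (sum(M4[k][n] * coeffs[k] for k in range(4)) + add) >> shift
        for n in range(4)
    ]

def idct4x4(block, bit_depth=8):
    """Separable 2-D inverse transform: columns first, then rows."""
    # Pass 1: cols[c][r] is the intermediate sample at row r, column c.
    cols = [idct4_1d([block[r][c] for r in range(4)], 7) for c in range(4)]
    # Pass 2: transform each row of the intermediate block.
    shift2 = 20 - bit_depth
    return [idct4_1d([cols[c][r] for c in range(4)], shift2) for r in range(4)]
```

The two nested loops with fixed bounds and constant multipliers are the structure an HLS tool would unroll and pipeline; a block holding only a DC coefficient of 64 comes out as a flat residual of 1 in every position.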

ON THE PARALLEL IMPLEMENTATIONS OF THE LINEAR KALMAN AND LAINIOTIS FILTERS AND THEIR EFFICIENCY

SIGNAL PROCESSING, 1991, Vol. 25, No. 3, pp. 289-305

Authors: KATSIKAS, SK; LIKOTHANASSIS, SD; LAINIOTIS, DG. Affiliations: Technological Education Institute of Athens, Department of Computer Science, Ag. Spyridonas, 12210 Egaleo, Athens, Greece; University of the Aegean, Department of Mathematics, Karlovassi 83200, Samos, Greece; Technological Education Institute of Patras, School of Applied Technology, Koukouli, Patras, Greece; University of Patras, Department of Computer Engineering, 26500 Patras, Greece; Florida Institute of Technology, Department of Electrical and Computer Engineering, Melbourne, FL 32901-6988, USA; INDECON Advanced Technology, 9–11 Laodikias & Michalacopoulou St., Athens 11528, Greece

In this paper, the parallel implementations of two well-known linear state-space filtering algorithms, namely the Kalman and the Lainiotis filters, in MIMD machines are studied from a computational standpoint. The analysis assumes both time invariant and time varying system models and uses precedence graphs and critical paths. The parallelism efficiency of the implementations is also defined and studied. Results indicate that these algorithms can be implemented in parallel using a comparatively small number of processors. Furthermore, the efficiency of the parallel implementations can be very high or very low, depending on the state and measurement vector dimensions.

Keywords: KALMAN FILTERS; LAINIOTIS FILTERS; PARALLEL IMPLEMENTATIONS
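
The kind of analysis the abstract describes, precedence graphs and critical paths, can be illustrated with a small sketch. The task names and durations used in the usage note are invented for illustration, and efficiency is taken in the common sense of speedup divided by processor count, which may differ from the definition used in the paper.

```python
def critical_path_length(durations, predecessors):
    """Longest weighted path through a precedence graph (assumed acyclic)."""
    finish = {}

    def finish_time(task):
        # Earliest finish = latest predecessor finish + own duration.
        if task not in finish:
            start = max(
                (finish_time(p) for p in predecessors.get(task, ())),
                default=0,
            )
            finish[task] = start + durations[task]
        return finish[task]

    return max(finish_time(t) for t in durations)

def speedup_and_efficiency(durations, predecessors, processors):
    """Best-case speedup and efficiency on `processors` identical processors."""
    serial = sum(durations.values())
    critical = critical_path_length(durations, predecessors)
    # Parallel time can beat neither the critical path nor an even work split.
    parallel = max(critical, serial / processors)
    speedup = serial / parallel
    return speedup, speedup / processors
```

For a hypothetical filter step with tasks `predict_state` (1), `predict_cov` (3), `gain` (4, after `predict_cov`), `update_state` (2, after `predict_state` and `gain`) and `update_cov` (3, after `gain`), the critical path is 10 out of 13 units of work, so adding processors beyond two cannot help and efficiency drops accordingly.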

Implementations of Sorted-QR Decomposition for MIMO Receivers: Complexity, Reusability and Efficiency Analysis

JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2012, Vol. 69, No. 1, pp. 41-53

Authors: Ramakrishnan, Venkatesh; Veerkamp, Tobias; Ascheid, Gerd; Adrat, Marc; Antweiler, Markus. Affiliations: Rhein Westfal TH Aachen, Inst Commun Technol & Embedded Syst, D-52062 Aachen, Germany; Fraunhofer Gesell, Informat Proc & Ergonon FKIE, Fraunhofer Inst Commun, D-53343 Wachtberg, Germany

Matrix decomposition of the channel matrix in the form of QR decomposition (QRD) is needed for advanced multiple input and multiple output (MIMO) demapping algorithms like sphere decoder. Due to the computation-intensive nature of the QRD, its implementation has to be highly efficient. Flexibility in several forms, e.g. support for different algorithms, reusability of wireless implementations, portability, etc. is highly sought in wireless devices. The contradictory nature of flexibility and efficiency requires tradeoffs to be made between them in system development. In this paper, we have analyzed such tradeoffs by implementing two minimum mean squared error-sorted QRD algorithms. The algorithms have been implemented in four different methods with varying degree of reusability and in five different forms of portability. The performance of the implementations is evaluated by using the real-time constraints from the LTE standard. For all the implementations, modular equations for accurately estimating the execution time are derived.

Keywords: MIMO; Sorted QRD; MMSE-SQRD; Modified Gram Schmidt; Givens rotation; CORDIC; Software defined radio; Mapping exploration; Flexibility; Portability; Reusability; Efficiency
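
A sorted QR decomposition based on modified Gram-Schmidt, one of the techniques listed in the keywords, can be sketched as below. This is a plain real-valued illustration that assumes a full-column-rank matrix; it is not the paper's implementation, and the MMSE extension and complex arithmetic are left out. The name `sorted_qrd` and the data layout are assumptions.

```python
import math

def sorted_qrd(H):
    """Sorted QR decomposition by modified Gram-Schmidt.

    H is a list of rows. Returns (Q, R, perm) where Q is a list of
    orthonormal columns, R is upper triangular, and column c of Q*R
    equals column perm[c] of H.
    """
    m, n = len(H), len(H[0])
    Q = [[float(H[r][c]) for r in range(m)] for c in range(n)]  # Q[c] = column c
    R = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for i in range(n):
        # Sorting step: process the remaining column with the smallest norm next.
        k = min(range(i, n), key=lambda c: sum(v * v for v in Q[c]))
        Q[i], Q[k] = Q[k], Q[i]
        perm[i], perm[k] = perm[k], perm[i]
        for row in R:
            row[i], row[k] = row[k], row[i]
        # Normalize the chosen column.
        R[i][i] = math.sqrt(sum(v * v for v in Q[i]))
        Q[i] = [v / R[i][i] for v in Q[i]]
        # Remove its component from every remaining column.
        for j in range(i + 1, n):
            R[i][j] = sum(a * b for a, b in zip(Q[i], Q[j]))
            Q[j] = [b - R[i][j] * a for a, b in zip(Q[i], Q[j])]
    return Q, R, perm
```

The only difference from ordinary modified Gram-Schmidt is the column selection at the top of each iteration, which is why the sorting adds little arithmetic but does add data-dependent control flow.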

Matrix approach for fast implementations of logarithmic MAP decoding of turbo codes

IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM 2001)

Authors: Wang, DY; Kobayashi, H. Affiliation: Princeton Univ, Dept Elect Engn, Princeton, NJ 08544, USA

ISBN: (print) 0780370805

This paper describes two new matrix transform algorithms for the Max-Log-MAP decoding of turbo codes. In the proposed algorithms, the successive decoding procedures carried out in the conventional Max-Log-MAP algorithm are performed in parallel, and well formulated into a set of simple and regular matrix operations, which can therefore considerably speed up the decoding operations and reduce the computational complexity. The matrix Max-Log-MAP algorithms also maintain the advantage of the general logarithmic MAP like algorithms in avoiding complex numerical representation problems. They particularly facilitate the implementations of the logarithmic MAP like algorithms in special-purpose parallel processing VLSI hardware architectures. The matrix algorithms also allow simple implementations by using shift registers. The proposed implementation architectures for the matrix Max-Log-MAP decoding can effectively reduce the memory capacity and simplify the data accesses and transfers required by the conventional Max-Log-MAP as well as MAP algorithms.

Keywords: turbo codes; MAP algorithm; Max-Log-MAP algorithm; matrix transform; VLSI
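
The idea of writing Max-Log-MAP recursions as matrix operations can be illustrated with a generic max-plus formulation, in which the sum of a matrix-vector product is replaced by a maximum and the product by an addition. This sketch is not the paper's two algorithms; the function names and the toy two-state trellis in the usage note are assumptions.

```python
import math

NEG_INF = float("-inf")  # metric of an impossible state or transition

def log_sum_exp(a, b):
    """Exact log(e^a + e^b); Max-Log-MAP approximates this by max(a, b)."""
    if a == NEG_INF or b == NEG_INF:
        return max(a, b)
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_plus_matvec(gamma, alpha):
    """One forward step: new_alpha[s] = max over s' of alpha[s'] + gamma[s'][s]."""
    n = len(alpha)
    return [max(alpha[sp] + gamma[sp][s] for sp in range(n)) for s in range(n)]

def forward_metrics(gammas, alpha0):
    """Run the forward recursion over a sequence of branch-metric matrices."""
    alphas = [alpha0]
    for gamma in gammas:
        alphas.append(max_plus_matvec(gamma, alphas[-1]))
    return alphas
```

For example, starting from `alpha0 = [0.0, NEG_INF]` and applying the branch-metric matrix `[[0.0, -1.0], [-2.0, 0.5]]` twice yields state metrics `[0.0, -1.0]` and then `[0.0, -0.5]`. Since each step is a regular matrix-vector pattern, all states can be updated in parallel.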

Parallel implementations with low-complexity of rotation-based adaptive filters

IEEE Workshop on Signal Processing Systems Design and Implementations (SiPS 05)

Author: Bhouri, M. Affiliation: Institut Supérieur d'Informatique et de Mathématiques

ISBN: (print) 0780393333

This paper proposes a new parallel implementation of some rotation-based adaptive filters [1]. These filters are characterized by a robust behavior to input signal correlation [2] and good numerical properties. However, their implementations have reduced complexities. The circuits based on these block-diagonal adaptive algorithms use fewer computing cells than the systolic circuit of the QR-RLS algorithm. Nevertheless, these new and low-complexity architectures no longer have a pipeline structure.

Keywords: Adaptive filtering

Area, delay, and power characteristics of standard-cell implementations of the AES S-Box

JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2008, Vol. 50, No. 2, pp. 251-261

Authors: Tillich, Stefan; Feldhofer, Martin; Popp, Thomas; Grosschadl, Johann. Affiliations: Univ Bristol, Dept Comp Sci, Bristol BS8 1UB, Avon, England; Graz Univ Technol, Inst Appl Informat Proc & Commun, A-8010 Graz, Austria

Cryptographic substitution boxes (S-boxes) are an integral part of modern block ciphers like the Advanced Encryption Standard (AES). There exists a rich literature devoted to the efficient implementation of cryptographic S-boxes, wherein hardware designs for FPGAs and standard cells received particular attention. In this paper we present a comprehensive study of different standard-cell implementations of the AES S-box with respect to timing (i.e. critical path), silicon area, power consumption, and combinations of these cost metrics. We examine implementations which exploit the mathematical properties of the AES S-box, constructions based on hardware look-up tables, and dedicated low-power solutions. Our results show that the timing, area, and power properties of the different S-box realizations can vary by up to almost an order of magnitude. In terms of area and area-delay product, the best choices are implementations which calculate the S-box output. On the other hand, the hardware look-up solutions are characterized by the shortest critical path. The dedicated low-power implementations not only reduce power consumption by a large degree, but also show good timing properties and offer the best power-delay and power-area product, respectively.

Keywords: Advanced Encryption Standard (AES); substitution box (S-box); inversion in the finite field GF($2^8$); standard cell implementation; silicon area; critical path delay; power consumption
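
The distinction the abstract draws between looking up the S-box and calculating it can be made concrete with a short sketch that computes each S-box entry from its mathematical definition: inversion in GF(2^8) followed by the AES affine map. This is the plain definition, not any of the optimized circuits studied in the paper (which typically decompose the inversion further); the function names are illustrative assumptions.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B  # reduce by the AES field polynomial
    return result

def gf_inverse(a):
    """Multiplicative inverse computed as a^254; 0 maps to 0 by convention."""
    result, power, exponent = 1, a, 254
    while exponent:
        if exponent & 1:
            result = gf_mul(result, power)
        power = gf_mul(power, power)
        exponent >>= 1
    return result

def rotl8(x, n):
    """Rotate a byte left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF

def sbox(a):
    """AES S-box: field inversion followed by the affine transformation."""
    inv = gf_inverse(a)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63
```

A look-up implementation would simply precompute `[sbox(a) for a in range(256)]` once, trading the arithmetic above for 256 bytes of storage, which mirrors the area-versus-delay trade-off reported in the abstract.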

Exploration of Soft-Output MIMO Detector Implementations on Massive Parallel Processors

JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2011, Vol. 64, No. 1, pp. 75-92

Authors: Fasthuber, Robert; Li, Min; Novo, David; Raghavan, Praveen; Van Der Perre, Liesbet; Catthoor, Francky. Affiliation: IMEC, B-3001 Louvain, Belgium

Emerging Software Defined Radio (SDR) baseband platforms are based on multiple processors with massive parallelism. Although the computational power of these platforms would theoretically enable SDR solutions with advanced wireless signal processing, existing work still implements rather basic algorithms. For instance, current Multiple-Input Multiple-Output (MIMO) detector implementations are typically based on simple linear hard-output and not on advanced near-Maximum Likelihood (ML) soft-output detection. However, only the latter makes it possible to exploit the full potential of MIMO technology. In this work, we explore the feasibility of advanced soft-output near-ML MIMO detectors on massive parallel processors. Although such detectors are considered to be very challenging due to their high computational complexity, we combine architecture-friendly algorithm design, application specific instructions and instruction-level/data-level parallelism explorations to make SDR solutions feasible. We show that, by applying the proposed combination of techniques, it is possible to obtain SDR implementations which can deliver data rates that are sufficient for future wireless systems. For example, a 2 x 4 Coarse Grain Array (CGA) processor with 16-way Single Instruction Multiple Data (SIMD) can deliver 192/368 Mbps throughput for 2 x 2 64/16-QAM transmissions. Finally, we estimate the area and power consumption of the programmable solution and compare it against a traditional Application Specific Integrated Circuit (ASIC) approach. This enables us to draw conclusions from the cost perspective.

Keywords: MIMO; SDR; SSFE; LLR; CGA; ASIC
