Search Results - Inner Mongolia University Library

42nd Annual International Conference on Parallel Processing (ICPP)

Authors: Zain-ul-Abdin; Ahlander, Anders; Svensson, Bertil (Halmstad Univ, Ctr Res Embedded Syst (CERES), Halmstad, Sweden; Saab AB, Gothenburg, Sweden)

ISBN: (print) 9780769551173

Next-generation radar systems place high performance demands on the signal processing chain. Examples include advanced image-creating sensor systems in which complex calculations are to be performed on huge sets of data in real time. Manycore architectures are gaining attention as a means to meet the computational requirements of complex radar signal processing by exploiting the massive parallelism inherent in the algorithms in an energy-efficient manner. In this paper, we evaluate a manycore architecture, namely a 16-core Epiphany processor, by implementing two significantly large case studies, viz. an autofocus criterion calculation and the fast factorized back-projection algorithm, both key components in modern synthetic aperture radar systems. The implementation results from the two case studies are compared on the basis of achieved performance and programmability. One of the Epiphany implementations demonstrates the usefulness of the architecture for the streaming-based algorithm (the autofocus criterion calculation) by achieving a speedup of 8.9x over a sequential implementation on a state-of-the-art general-purpose processor of a later silicon technology generation and operating at a 2.7x higher clock speed. On the other case study, a highly memory-intensive algorithm (fast factorized back-projection), the Epiphany architecture shows a speedup of 4.25x. For embedded signal processing, low power dissipation is as important as computational performance. In our case studies, the Epiphany implementations of the two algorithms are, respectively, 78x and 38x more energy efficient.
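One way to read the figures in this abstract is to normalize the reported speedup by the clock-speed ratio and to split the energy-efficiency gain into a time factor and a power factor. The sketch below is only a back-of-the-envelope reading: it assumes that energy equals average power times run time, and the function names are illustrative and do not come from the paper.

```python
def clock_normalized_speedup(speedup: float, clock_ratio: float) -> float:
    """Speedup in clock cycles when the baseline clock is clock_ratio times faster."""
    return speedup * clock_ratio


def implied_power_ratio(energy_gain: float, speedup: float) -> float:
    """Baseline-to-Epiphany average power ratio, ASSUMING energy = power * time."""
    return energy_gain / speedup


# Figures quoted in the abstract: 8.9x and 4.25x speedups, 2.7x clock ratio,
# 78x and 38x energy-efficiency gains.
autofocus_per_cycle = clock_normalized_speedup(8.9, 2.7)  # about 24x
autofocus_power = implied_power_ratio(78.0, 8.9)          # about 8.8x
ffbp_power = implied_power_ratio(38.0, 4.25)              # about 8.9x
```

Under that assumption both case studies imply a similar power ratio, which is what one would expect when the same two chips are compared on both algorithms.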

Keywords: Manycore architecture; Parallel programming; Radar signal processing

Energy-Efficient Synthetic-Aperture Radar Processing on a Manycore Architecture

International Conference on Parallel Processing (ICPP)

Authors: Zain-Ul-Abdin; Anders Åhlander; Bertil Svensson (Centre for Research on Embedded Systems (CERES), Halmstad University, Halmstad, Sweden; Saab AB, Gothenburg, Sweden)

Next-generation radar systems place high performance demands on the signal processing chain. Examples include advanced image-creating sensor systems in which complex calculations are to be performed on huge sets of data in real time. Manycore architectures are gaining attention as a means to meet the computational requirements of complex radar signal processing by exploiting the massive parallelism inherent in the algorithms in an energy-efficient manner. In this paper, we evaluate a manycore architecture, namely a 16-core Epiphany processor, by implementing two significantly large case studies, viz. an autofocus criterion calculation and the fast factorized back-projection algorithm, both key components in modern synthetic aperture radar systems. The implementation results from the two case studies are compared on the basis of achieved performance and programmability. One of the Epiphany implementations demonstrates the usefulness of the architecture for the streaming-based algorithm (the autofocus criterion calculation) by achieving a speedup of 8.9x over a sequential implementation on a state-of-the-art general-purpose processor of a later silicon technology generation and operating at a 2.7x higher clock speed. On the other case study, a highly memory-intensive algorithm (fast factorized back-projection), the Epiphany architecture shows a speedup of 4.25x. For embedded signal processing, low power dissipation is as important as computational performance. In our case studies, the Epiphany implementations of the two algorithms are, respectively, 78x and 38x more energy efficient.

Keywords: Signal processing algorithms; Computer architecture; Synthetic aperture radar; Parallel processing; Radar imaging; Image resolution

Efficient implementations for AES Encryption and Decryption

Circuits, Systems, and Signal Processing, 2012, vol. 31, no. 5, pp. 1765-1785

Authors: Rachh, Rashmi Ramesh; Mohan, P. V. Ananda; Anami, B. S. (ECIL R&D, Bangalore 560052, Karnataka, India; KLE Soc Coll Engn & Technol, Dept Comp Sci, Belgaum 590008, India; KLE Inst Technol, Hubli, India)

This paper proposes two efficient architectures for hardware implementation of the Advanced Encryption Standard (AES) algorithm. The composite field arithmetic for implementing SubBytes (S-box) and InvSubBytes (Inverse S-box) transformations investigated by several authors is used as the basis for deriving the proposed architectures. The first architecture for encryption is based on an optimized S-box followed by bit-wise implementation of MixColumns and AddRoundKey, and for decryption on an optimized Inverse S-box followed by bit-wise implementation of InvMixColumns and AddMixRoundKey. The proposed S-box and Inverse S-box used in this architecture are designed as a cascade of three blocks. In the second proposed architecture, block III of the proposed S-box is combined with the MixColumns and AddRoundKey transformations, forming an integrated unit for encryption. An integrated unit for decryption combining block III of the proposed InvSubBytes with InvMixColumns and AddMixRoundKey is formed on similar lines. The delays of the proposed architectures for VLSI implementation are found to be the shortest compared to the state-of-the-art implementations of AES operating in non-feedback mode. Iterative and fully unrolled sub-pipelined designs including key schedule are implemented using FPGA and ASIC. The proposed designs are efficient in terms of Kgates/Giga-bits per second ratio compared with a few recent state-of-the-art ASIC (0.18-µm CMOS standard cell) based designs and throughput per area (TPA) for FPGA implementations.
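For readers unfamiliar with the transformations named in this abstract, the following is a minimal software sketch of the standard AES MixColumns and AddRoundKey steps on a single 4-byte column. It illustrates the arithmetic only and is not the paper's hardware architecture; the function names are chosen here for illustration.

```python
def xtime(b: int) -> int:
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B  # reduce by the AES polynomial
    return b & 0xFF


def mix_column(col):
    """Apply the AES MixColumns matrix to one column of four bytes."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,  # 2*a0 + 3*a1 + a2 + a3
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,  # a0 + 2*a1 + 3*a2 + a3
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),  # a0 + a1 + 2*a2 + 3*a3
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),  # 3*a0 + a1 + a2 + 2*a3
    ]


def add_round_key(col, key):
    """XOR a column with the matching four round-key bytes."""
    return [c ^ k for c, k in zip(col, key)]


mixed = mix_column([0xDB, 0x13, 0x53, 0x45])
```

Because every operation here is a shift or an XOR, each output bit is an XOR of input bits, which is what makes a bit-wise hardware realization of these two steps possible.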

Keywords: Advanced Encryption Standard; Encryption; Decryption; FPGA implementation; VLSI architectures

Correctly rounded floating-point division for DSP-enabled FPGAs

International Conference on Field Programmable Logic and Applications

Author: Bogdan Pasca (Altera European Technology Centre, High Wycombe, UK)

Floating-point division is a very costly operation in FPGA designs. High-frequency implementations of the classic digit-recurrence algorithms for division have long latencies (of the order of the number of fraction bits) and consume large amounts of logic. Additionally, these implementations require significant routing resources, making timing closure difficult in complete designs. In this paper we present two multiplier-based architectures for division which make efficient use of the DSP resources in recent Altera FPGAs. By balancing resource usage between logic, memory and DSP blocks, the presented architectures maintain high frequencies in full designs. Additionally, compared to classical algorithms, the proposed architectures have significantly lower latencies. The architectures target faithfully rounded results, similar to most elementary function implementations for FPGAs, but can also be transformed into correctly rounded architectures with a small overhead. The presented architectures are built using the Altera DSP Builder Advanced framework and will be part of the default blockset.
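The abstract does not say which multiplier-based scheme the two architectures use. As a generic illustration of how division can be built from multiplications alone, the sketch below uses the textbook Newton-Raphson reciprocal iteration; it is a stand-in chosen here, not the paper's design, and it makes no attempt at faithful or correct rounding.

```python
def nr_divide(n: float, d: float, iterations: int = 4) -> float:
    """Approximate n / d for d in [0.5, 1) using only multiplies and subtracts."""
    if not 0.5 <= d < 1.0:
        raise ValueError("d must be normalized to [0.5, 1)")
    # Well-known linear initial guess for 1/d on [0.5, 1).
    x = 48.0 / 17.0 - (32.0 / 17.0) * d
    for _ in range(iterations):
        # Newton-Raphson step: the relative error roughly squares each time.
        x = x * (2.0 - d * x)
    return n * x


quotient = nr_divide(1.0, 0.75)
```

The quadratic convergence is why such schemes can have much lower latency than digit-recurrence division, which produces a fixed number of result bits per step.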

Keywords: Digital signal processing; Polynomials; Field programmable gate arrays; Approximation error; Memory management

ECG Signal Processing, Classification and Interpretation: A Comprehensive Framework of Computational Intelligence

2011

Authors: Adam Gacek and Witold Pedrycz

ISBN: (print) 0857298674

The book shows how the various paradigms of computational intelligence, employed either singly or in combination, can produce an effective structure for obtaining often vital information from ECG signals. The text is self-contained, addressing concepts, methodology, algorithms, and case studies and applications, providing the reader with the necessary background augmented with step-by-step explanation of the more advanced concepts. It is structured in three parts: Part I covers the fundamental ideas of computational intelligence together with the relevant principles of data acquisition, morphology and use in diagnosis; Part II deals with techniques and models of computational intelligence that are suitable for signal processing; and Part III details ECG system-diagnostic interpretation and knowledge acquisition architectures. Illustrative material includes brief numerical experiments, detailed schemes, exercises and more advanced problems.

Algorithm design for efficient implementation of coding and signal processing systems

Author: Dai, Yongmei (Lehigh University)

Degree: Ph.D.

Advanced error control coding and signal processing techniques find wide application in various communication systems, such as magnetic recording channels, fiber optical channels, and wireline and wireless communication systems. Low-density parity-check (LDPC) codes and multiple-input multiple-output (MIMO) technology have been receiving a lot of attention, since they greatly increase the capacity and improve the performance of future communication systems. In this dissertation, we focus on designing algorithms that enable efficient hardware implementations of LDPC codes and MIMO detection systems. Quasi-cyclic (QC) LDPC codes are of great interest since their regular code structure leads to efficient hardware implementations. We propose and implement in FPGA two partly parallel decoder architectures for QC LDPC codes to improve the decoding throughput and memory requirement of existing decoders. Our overlapped message passing (OMP) decoder achieves the maximum throughput gain and hardware utilization efficiency (HUE) due to overlapping, and hence has higher throughput and HUE than previously proposed OMP decoders while maintaining the same hardware requirements and the same error performance. We also show that the maximum throughput gain and HUE achieved by our OMP decoder are ultimately determined by the given code. Thus, we propose a coset-based construction method, which results in QC LDPC codes that allow our optimal OMP decoder to achieve higher throughput and HUE. To further reduce the memory requirement of our OMP decoder, we propose the parallel turbo-sum-product (PTSP) decoder architecture. Implementation results show that our PTSP decoder achieves better error performance, faster convergence and hence higher throughput than the OMP decoder with reduced memory requirement. Hardware implementations of tree search based MIMO detection often have limited performance due to large memory requirement or high computational complexity of sophisticated MIMO detection algorithm

Digital media processing: DSP algorithms using C

2010

Author: Malepati Hazarathaiah.

Source: Inner Mongolia University Library (books)

Efficient arithmetic sum-of-product (SOP) based Multiple Constant Multiplication (MCM) for FFT 10

IEEE International Conference on Computer-Aided Design

Authors: Vinay Karkala; Joseph Wanstrath; Travis Lacour; Sunil P. Khatri (Department of ECE, Texas A and M University, College Station, TX, USA; Department of ECE, Rose Hulman Institute of Technology, IN, USA)

ISBN: (print) 9781424481927

In this paper, we present an arithmetic sum-of-products (SOP) based realization of the general Multiple Constant Multiplication (MCM) algorithm. We also propose an enhanced SOP based algorithm, which uses Partial Max-SAT (PMSAT) to further optimize the SOP. The enhanced algorithm attempts to reduce the number of rows (partial products) of the SOP, by i) shifting coefficients to realize other coefficients when possible, ii) exploring multiple implementations of each coefficient using a Minimal Signed Digit (MSD) format and iii) exploiting the mutual exclusiveness within certain groups of partial products. Hardware implementations of the Fast Fourier Transform (FFT) algorithm require the incoming data to be multiplied by one of several constant coefficients. We test/validate it for FFT, which is an important problem. We compare our SOP-based architectures with the best existing implementation of MCM for FFT (which utilizes a cascade of adders), and show that our approaches show a significant improvement in area and delay. Our architecture was synthesized using 65nm technology libraries.
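To make the idea of partial-product rows concrete, the sketch below multiplies by a constant using one signed-digit recoding of that constant, so that each nonzero digit contributes one shifted copy of the input. It uses the non-adjacent form, which is a single minimal signed-digit representation; the paper explores multiple such representations and a Partial Max-SAT optimization, neither of which is reproduced here. All function names are illustrative.

```python
def signed_digits(c: int) -> list:
    """Non-adjacent form of c >= 0: digits in {-1, 0, 1}, least significant first."""
    digits = []
    while c:
        if c & 1:
            d = 2 - (c & 3)  # +1 if c % 4 == 1, -1 if c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x: int, c: int) -> int:
    """Compute x * c as a sum of signed, shifted copies of x (one per row)."""
    return sum(d * (x << i) for i, d in enumerate(signed_digits(c)) if d)


def row_count(c: int) -> int:
    """Number of partial-product rows this recoding needs for constant c."""
    return sum(1 for d in signed_digits(c) if d)


product = shift_add_multiply(13, 7)  # 7 is recoded as 8 - 1, so two rows
```

In plain binary the constant 7 would need three rows (4 + 2 + 1); the signed-digit recoding needs two, which is the kind of row reduction the abstract refers to.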

Keywords: Delay; Adders; Signal processing algorithms; Optimization; Silicon; Hardware; Fast Fourier transforms

Advanced Signal Processing Algorithms, Architectures, and Implementations XVIII

Advanced Signal Processing Algorithms, Architectures, and Implementations XVIII

ISBN: (print) 9780819472946

The proceedings contain 29 papers. The topics discussed include: GPU implementations for fast factorizations of STAP covariance matrices; accelerating nonuniform fast Fourier transform via reduction in memory access latency; fast computation of local correlation coefficients; object tracking in omni-directional mosaic; 3D object matching on the GPU using spin-image surface matching; a sharpness metric implementation for image processing applications with feedback; superresolution imaging: a survey of current techniques; analytical approximations of translational subpixel shifts in signal and image registrations; simultaneous position and number of source estimates using random set theory; decision fusion in sensor networks for spectrum sensing based on likelihood ratio tests; and energy optimization for upstream data transfer in 802.15.4 beacon-enabled star formulation.

Analytical approximations of translational subpixel shifts in signal and image registrations

Conference on Advanced Signal Processing Algorithms, Architectures, and Implementations XVIII

Author: Zhang, Qiang (Wake Forest Univ Hlth Sci, Biostat Sci Dept, Winston Salem, NC 27157, USA)

ISBN: (print) 9780819472946

Analytical approximations of translational subpixel shifts in both signal and image registrations are derived by setting the derivatives of a normalized cross-correlation function to zero and solving them. Without the need for iterative searching, this method achieves a complexity of only O(mn), given an image size of m x n. Without the need to upsample, computation memory is also saved. Tests using simulated signals and images show good results.
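The closed-form expressions themselves are not given in this abstract. As a rough 1-D illustration of estimating a subpixel shift from a cross-correlation peak without upsampling or iterative search, the sketch below substitutes a simpler and widely used technique: fitting a parabola through the three correlation samples around the integer peak. It uses an unnormalized correlation and is not the paper's derivation; all names are illustrative.

```python
import math


def cross_correlation(f, g, lag):
    """Sum of f[i] * g[i + lag] over the samples where both indices are valid."""
    n = len(f)
    return sum(f[i] * g[i + lag] for i in range(max(0, -lag), min(n, n - lag)))


def subpixel_shift(f, g, max_lag=5):
    """Estimate how far g is shifted relative to f, to a fraction of a sample."""
    corr = {k: cross_correlation(f, g, k) for k in range(-max_lag, max_lag + 1)}
    # Integer peak, kept away from the ends so both neighbours exist.
    k0 = max(range(-max_lag + 1, max_lag), key=lambda k: corr[k])
    left, mid, right = corr[k0 - 1], corr[k0], corr[k0 + 1]
    # Vertex of the parabola through the three samples around the peak.
    return k0 + 0.5 * (left - right) / (left - 2.0 * mid + right)


# Simulated test signal: a Gaussian pulse and a copy delayed by 0.3 samples.
f = [math.exp(-((i - 32.0) ** 2) / 50.0) for i in range(64)]
g = [math.exp(-((i - 32.3) ** 2) / 50.0) for i in range(64)]
estimate = subpixel_shift(f, g)
```

Each correlation value costs one pass over the signal and only a fixed number of lags is examined, so the work grows linearly with the number of samples.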

Keywords: Subpixel image registration; Image superresolution
