Search results - Inner Mongolia University Library

IEEE International Symposium on Circuits and Systems (ISCAS)

Authors: T. Roska; A. Rodriguez-Vazquez. Affiliations: Computer & Automation Institute, Hungarian Academy of Sciences (ATOMKI), Budapest, Hungary; Instituto de Microelectrónica de Sevilla-CNM-CSIC, Edificio CICA-CNM, Seville, Spain

While in most application areas digital processors can solve problems initially, in some fields their capabilities are very limited. A typical example is vision. Simple animals outperform super-computers in the realization of basic vision tasks. In order to overcome the limitations of these conventional systems, a fundamentally different array architecture is needed. This architecture is based on the new paradigm of analogic cellular (CNN) computing, whose most advanced implementation is the so-called CNN universal machine (CNN-UM). Its main components are: a) a parallel architecture consisting of an array of locally-connected analog processors; b) a means of storing, locally, pixel-by-pixel, the intermediate computation results; and c) stored on-chip programmability. When implemented as a mixed-signal VLSI chip, the CNN-UM is capable of image processing at rates of trillions of operations per second with very small size and low power consumption. On the other hand, when integrating the adaptive multi-sensor array in the CNN-UM, the resulting sensor+computer array offers unprecedented capabilities. This paper reviews the latest results on CNN-UM chips and systems, and outlines the envisaged roadmap for these computers.

Keywords: Cellular neural networks; Sensor arrays; Computer architecture; Adaptive arrays; Animals; Turing machines; Parallel architectures; Analog computers; Concurrent computing; Very large scale integration

SIMPLIFIED FLOATING-POINT DIVISION AND SQUARE ROOT

IEEE International Conference on Acoustics, Speech, and Signal Processing

Authors: Timo Viitanen; Pekka Jaaskelainen; Otto Esko; Jarmo Takala. Affiliation: Tampere University of Technology, Department of Computer Systems

ISBN: (print) 9781479903573

Digital signal processing (DSP) algorithms on low-power embedded platforms are often implemented using fixed-point arithmetic due to expected power and area savings over floating-point computation. However, recent research shows that floating-point arithmetic can be made competitive by using a reduced-precision format instead of, e.g., IEEE standard single precision, thereby avoiding the algorithm design and implementation difficulties associated with fixed-point arithmetic. This paper investigates the effects of simplified floating-point arithmetic applied to an FMA-based floating-point unit and the associated software division and square root operations. Software operations are proposed which attain near-exact precision with twice the performance of exact algorithms and resolve overflow-related errors with inexpensive exponent-manipulation special instructions.
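The abstract does not spell out the software division routine, so the following is only a minimal sketch of one common approach on multiply-add hardware: Newton-Raphson refinement of a reciprocal, with the operand scaled by exponent manipulation first. The function names `nr_reciprocal` and `nr_divide`, the linear seed, and the iteration count are assumptions for illustration, not the authors' algorithm. Python's ordinary multiply and add stand in for a fused multiply-add, so the rounding behavior differs from a real FMA unit.

```python
import math


def nr_reciprocal(b: float, iterations: int = 4) -> float:
    """Approximate 1/b by Newton-Raphson iteration (illustrative sketch)."""
    if b == 0.0 or math.isinf(b) or math.isnan(b):
        raise ValueError("b must be finite and non-zero")
    # Exponent manipulation: abs(b) == mantissa * 2**exponent, mantissa in [0.5, 1).
    mantissa, exponent = math.frexp(abs(b))
    # Linear seed for 1/mantissa on [0.5, 1); relative error is at most 1/17.
    x = 48.0 / 17.0 - (32.0 / 17.0) * mantissa
    for _ in range(iterations):
        e = 1.0 - mantissa * x  # residual; a single FMA on suitable hardware
        x = x + x * e           # correction; the error roughly squares each step
    # Undo the scaling and restore the sign of b.
    return math.copysign(math.ldexp(x, -exponent), b)


def nr_divide(a: float, b: float, iterations: int = 4) -> float:
    """Approximate a/b as a * (1/b); not guaranteed to be correctly rounded."""
    return a * nr_reciprocal(b, iterations)
```

Lowering `iterations` trades accuracy for speed, which is the kind of near-exact versus exact trade-off the abstract refers to; a correctly rounded result would need an additional correction step that this sketch omits.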

Keywords: Floating-point arithmetic; Accelerator architectures; Fused multiply-add; Digital signal processing implementations; Low-power design; LOW POWER; Algorithm design; Fixed-point arithmetic; Plazas; Digital signal processing; FLOATING POINTS

Hardware implementations of real‐time digital speech processing algorithms

The Journal of the Acoustical Society of America, 2005, vol. 78, no. S1, pp. S79-S79

Author: Elliot Singer. Affiliation: Lincoln Laboratory, Massachusetts Institute of Technology, Lexington, MA 02173‐0073

Continuing advances in hardware technologies are permitting the realization of increasingly sophisticated speech processing algorithms in real‐time equipment. The availability of commercial digital signal processing integrated circuit components has been especially responsible for a reduction in the size and cost of these devices. This presentation will describe the unique requirements of speech compression and speech recognition algorithms with respect to arithmetic calculations, memory, and I/O. Representative equipment designs developed at Lincoln Laboratory for realizing real‐time speech processing algorithms will be described. These include: the Lincoln digital signal processor (LDSP), a programmable, general‐purpose ECL machine suited for real‐time evaluation of speech processing algorithms; the advanced linear predictive coding microprocessor (ALPCM), a flexible bit‐slice processor designed for use in operational environments; and the compact linear predictive coder, a small, narrow‐band vocoder terminal based on DSP microprocessors. The application of advanced VLSI technology to meet the processing demands of large vocabulary speech recognition will be discussed, with specific focus on an approach being pursued at Lincoln Laboratory which uses wafer scale integration and restructurable VLSI technology to exploit the high level of concurrency in the recognition algorithm. [Work sponsored by the Department of the Air Force.]

Analytical Area Optimization Techniques for VLSI Implementations of FIR and FFT Algorithms (針對FIR與FFT演算法於超大型積體電路實作上之解析式面積最佳化技術)

Author: 林步青. Affiliation: Chiao Tung University (交通大學)

Degree level: Doctoral

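Over the past few decades, as the complexity of communication systems has risen sharply, digital signal processing algorithms such as finite impulse response (FIR) filters and the fast Fourier transform (FFT) have been widely adopted. Multiple constant multiplication (MCM) replaces conventional multipliers with a set of adders when input data are multiplied by constants, and the concept is widely applied in FIR filter design. Although many algorithms have been proposed that reduce the number of adders in order to shrink area, they do not consider the actual bit width of each adder, which makes the estimated hardware cost imprecise. In this dissertation, we therefore propose a bit-width-aware MCM optimization algorithm that focuses on minimizing the total bit width of the adders rather than only the number of adders. First, a subexpression graph is built from the given coefficients; next, a set of constraints for minimizing adder bit width is derived; finally, integer linear programming is used to obtain the optimized result. Experimental results show that the algorithm effectively reduces the required adder bit width and outperforms all existing techniques. In addition, the FFT processor is a core component of many systems based on digital signal processing, for example orthogonal frequency-division multiplexing (OFDM) in modern wireless communications. Many key design parameters, such as the architecture, word length, and number format, must be considered very carefully. Over the past decades, many optimized pipelined FFT architectures have been proposed for different design goals. Although a fixed pipelined FFT architecture offers good throughput at a reasonable hardware cost, it may still fall short in applications that demand very high throughput. In this dissertation, we therefore propose a scalable multi-path delay commutator (MDC) FFT architecture and a corresponding hardware design generator, which can quickly produce a matching FFT core for a given throughput requirement. Experimental results show that the FFT processors generated by this method are smaller in area and more power-efficient than existing foldable MDC-based FFT architectures. We also propose an optimization design flow for FFT processors. Under a fixed word length, the fixed-point representation at each butterfly stage is adjusted to maximize the signal-to-quantization-noise ratio (SQNR) at the output stage. The proposed flow uses probability distribution models to describe the probabilistic behavior of the output signal of each stage. The noise caused by quantization and saturation can be analyzed statically to understand its impact on scaling decisions. As a result, without time-consuming simulation, the proposed method can efficiently determine the most suitable number format for each butterfly stage and thereby optimize the SQNR of the output stage. The proposed flow also handles a variety of FFT sizes, FFT algorithms, word lengths, and input signal probability distributions. Experimental results show that our method saves 3 bits of word length in an 8192-point radix-2 FFT processor without affecting the SQNR at the output stage. An FFT processor built with the proposed static scaling optimization technique achieves an SQNR close to that of a processor equipped with an additional dynamic scaling method, without the large extra hardware cost of the latter.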
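As a small illustration of the multiple constant multiplication idea described in this abstract, the sketch below computes two constant products with shifts and two adders, reusing one intermediate result, and then estimates the bit width each adder output needs. The constants 7 and 105, the helper names, and the width model (unsigned inputs, adder width equal to output width) are assumptions chosen for illustration; they are not the dissertation's cost model or its integer linear programming formulation.

```python
def bits_for(coeff: int, input_bits: int) -> int:
    """Bits needed to hold coeff * x for any unsigned input_bits-wide x."""
    max_input = (1 << input_bits) - 1
    return (coeff * max_input).bit_length()


def mcm_7_105(x: int) -> tuple[int, int]:
    """Compute 7*x and 105*x using only shifts and two adders/subtractors."""
    x7 = (x << 3) - x       # adder 1: 7x = 8x - x
    x105 = (x7 << 4) - x7   # adder 2: 105x = 16*(7x) - 7x, sharing the 7x term
    return x7, x105


def adder_widths(input_bits: int = 8) -> dict[str, int]:
    """Estimated output width of each adder under the simple width model."""
    return {"7x": bits_for(7, input_bits), "105x": bits_for(105, input_bits)}
```

Two different shift-add decompositions can use the same number of adders yet differ in the sum of the widths reported by `adder_widths`, which is why counting adders alone can misjudge hardware area.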

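Keywords: Digital signal processing; Fast Fourier transform; Finite impulse response filter; Multiple constant multiplication; Fixed-point numbers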
