Search Results - Inner Mongolia University Library

Optimized Custom Precision Function Evaluation for Embedded Processors

IEEE TRANSACTIONS ON COMPUTERS, 2009, vol. 58, no. 1, pp. 46-59

Authors: Lee, Dong-U; Villasenor, John D. Affiliation: Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90095, USA

Fixed-point processors are utilized in an enormous variety of applications, often for tasks that require the evaluation of mathematical functions. We present an automated method for mapping functions to such processors via polynomials that explicitly targets the native word length of the processor, thereby significantly reducing the execution time relative to commonly used floating-point emulation approaches based on traditional mathematical libraries. The methods presented here also contrast with hand-tuned processor-specific code, which has the potential to deliver efficient implementations but at the cost of significant design time. We describe an automated design flow utilizing multiword arithmetic to provide overflow protection and precision accurate to one unit in the last place (ulp). Analytical approaches are used to minimize the number of fixed-width operands required for each operation and to ensure that precision requirements are met. This allows automated generation of processor-optimized code and characterization of a design space representing a rich range of trade-offs among precision, latency, and memory cost.

Keywords: computer arithmetic, elementary function approximation, minimax approximation and algorithms, processors, real-time and embedded systems, simulated annealing, spline and piecewise polynomial approximation
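
The abstract above describes evaluating functions through polynomials in the processor's native word length. As a rough illustration only (not the authors' generated code, which relies on multiword arithmetic and guarantees 1 ulp), the sketch below evaluates a polynomial with Horner's rule using integer operations in an assumed Q15 fixed-point format. The names `FRAC_BITS`, `to_fixed`, and `horner_fixed` are hypothetical.

```python
FRAC_BITS = 15  # assumed fixed-point format (15 fractional bits); not from the paper


def to_fixed(x, frac_bits=FRAC_BITS):
    """Quantize a real number to a fixed-point integer (round to nearest)."""
    return int(round(x * (1 << frac_bits)))


def horner_fixed(coeffs_fx, x_fx, frac_bits=FRAC_BITS):
    """Evaluate a polynomial by Horner's rule with integer operations only.

    coeffs_fx: coefficients, highest degree first, each scaled by 2**frac_bits.
    x_fx: the argument, scaled by 2**frac_bits.
    """
    acc = coeffs_fx[0]
    for c in coeffs_fx[1:]:
        # The product carries 2*frac_bits fractional bits; shift back to frac_bits.
        acc = ((acc * x_fx) >> frac_bits) + c
    return acc


# p(x) = 0.5*x^2 + 0.25*x + 0.125 evaluated at x = 0.5
coeffs = [to_fixed(0.5), to_fixed(0.25), to_fixed(0.125)]
y_fx = horner_fixed(coeffs, to_fixed(0.5))
y = y_fx / (1 << FRAC_BITS)
```

Python integers never overflow, so this sketch hides the overflow concern that the abstract addresses with multiword arithmetic; on a real 16-bit target the intermediate product would need a wider accumulator.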

A bit-width optimization methodology for polynomial-based function evaluation

IEEE TRANSACTIONS ON COMPUTERS, 2007, vol. 56, no. 4, pp. 567-571

Authors: Lee, Dong-U; Villasenor, John D. Affiliation: Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90095, USA

We present an automated bit-width optimization methodology for polynomial-based hardware function evaluation. Due to the analytical nature of the approach, overflow protection and precision accurate to one unit in the last place (ulp) can be guaranteed. A range analysis technique based on computing the root of the derivative of a signal is utilized to determine the minimal number of integer bits. Fractional bit requirements are established using an analytical error expression derived from the functions that occur along the data path. Global fractional bit optimization across multiple computation stages is performed using simulated annealing and circuit area estimation functions.

Keywords: computer arithmetic, elementary function approximation, field programmable gate arrays, finite wordlength effects, minimax approximation and algorithms
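
The range analysis mentioned in the abstract above locates a signal's extreme values at the roots of its derivative. The following is a minimal sketch of that idea for a quadratic signal only, assuming a two's complement format whose integer part includes the sign bit. The function names and the bit-counting convention are illustrative assumptions, not the paper's definitions.

```python
def quadratic_range(c2, c1, c0, lo, hi):
    """Return (min, max) of c2*x^2 + c1*x + c0 over the interval [lo, hi]."""
    def p(x):
        return (c2 * x + c1) * x + c0

    candidates = [lo, hi]            # extrema may sit at the interval ends
    if c2 != 0:
        x_star = -c1 / (2 * c2)      # root of the derivative 2*c2*x + c1
        if lo < x_star < hi:
            candidates.append(x_star)
    values = [p(x) for x in candidates]
    return min(values), max(values)


def integer_bits(vmin, vmax):
    """Smallest two's complement integer width (sign bit included) such that
    -2**(bits-1) <= vmin and vmax < 2**(bits-1). Assumed convention."""
    bits = 1
    while not (-(2 ** (bits - 1)) <= vmin and vmax < 2 ** (bits - 1)):
        bits += 1
    return bits


# Signal p(x) = -x^2 + 4x on [0, 3]: the maximum 4 occurs at the interior
# point x = 2, so checking the endpoints alone (values 0 and 3) would
# underestimate the required width.
vmin, vmax = quadratic_range(-1, 4, 0, 0, 3)
needed = integer_bits(vmin, vmax)
endpoint_only = integer_bits(0, 3)
```

In this example the endpoint-only estimate is one bit short, which is the kind of overflow the derivative-root analysis is meant to rule out.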

A hardware Gaussian noise generator using the Box-Muller method and its error analysis

IEEE TRANSACTIONS ON COMPUTERS, 2006, vol. 55, no. 6, pp. 659-671

Authors: Lee, DU; Villasenor, JD; Luk, W; Leong, PHW. Affiliations: Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90095, USA; Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2AZ, England; Chinese Univ Hong Kong, Dept Comp Sci & Engn, Shatin, Hong Kong, Peoples R China

We present a hardware Gaussian noise generator based on the Box-Muller method that provides highly accurate noise samples. The noise generator can be used as a key component in a hardware-based simulation system, such as for exploring channel code behavior at very low bit error rates, as low as 10^-12 to 10^-13. The main novelties of this work are accurate analytical error analysis and bit-width optimization for the elementary functions involved in the Box-Muller method. Two 16-bit noise samples are generated every clock cycle and, due to the accurate error analysis, every sample is analytically guaranteed to be accurate to one unit in the last place. An implementation on a Xilinx Virtex-4 XC4VLX100-12 FPGA occupies 1,452 slices, three block RAMs, and 12 DSP slices, and is capable of generating 750 million samples per second at a clock speed of 375 MHz. The performance can be improved by exploiting concurrent execution: 37 parallel instances of the noise generator at 95 MHz on a Xilinx Virtex-II Pro XC2VP100-7 FPGA generate seven billion samples per second and can run over 200 times faster than the output produced by software running on an Intel Pentium-4 3 GHz PC. The noise generator is currently being used at the Jet Propulsion Laboratory, NASA to evaluate the performance of low-density parity-check codes for deep-space communications.

Keywords: algorithms implemented in hardware, computer arithmetic, error analysis, elementary function approximation, field programmable gate arrays, minimax approximation and algorithms, optimization, random number generation, simulation
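
For readers unfamiliar with the Box-Muller method named in the abstract above, a floating-point software sketch follows. It shows only the basic transform, which turns two uniform samples into two Gaussian samples; the fixed-point function approximations, bit-width choices, and error guarantees of the hardware design are not modeled. The function names are hypothetical.

```python
import math
import random


def box_muller(u1, u2):
    """Map uniforms u1 in (0, 1] and u2 in [0, 1) to two independent
    standard normal samples using the basic Box-Muller transform."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def gaussian_samples(n_pairs, seed=0):
    """Generate 2*n_pairs samples from a seeded software uniform source."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_pairs):
        u1 = 1.0 - rng.random()   # shift [0, 1) to (0, 1] so log(u1) is finite
        u2 = rng.random()
        out.extend(box_muller(u1, u2))
    return out


samples = gaussian_samples(20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Each call to `box_muller` yields a pair of samples, which mirrors the two samples per clock cycle reported for the hardware generator.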

Optimizing hardware function evaluation

IEEE TRANSACTIONS ON COMPUTERS, 2005, vol. 54, no. 12, pp. 1520-1531

Authors: Lee, DU; Gaffar, AA; Mencer, O; Luk, W. Affiliations: Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90024, USA; Univ London Imperial Coll Sci Technol & Med, Dept Elect & Elect Engn, London SW7 2BT, England; Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2BZ, England

We present a methodology and an automated system for function evaluation unit generation. Our system selects the best function evaluation hardware for a given function, accuracy requirements, technology mapping, and optimization metrics, such as area, throughput, and latency. Function evaluation f(x) typically consists of range reduction and the actual evaluation on a small convenient interval such as [0, pi/2) for sin(x). We investigate the impact of hardware function evaluation with range reduction for a given range and precision of x and f(x) on area and speed. An automated bit-width optimization technique for minimizing the sizes of the operators in the data paths is also proposed. We explore a vast design space for fixed-point sin(x), log(x), and sqrt(x) accurate to one unit in the last place using MATLAB and ASC, A Stream Compiler for Field-Programmable Gate Arrays (FPGAs). In this study, we implement over 2,000 placed-and-routed FPGA designs, resulting in over 100 million Application-Specific Integrated Circuit (ASIC) equivalent gates. We provide optimal function evaluation results for range and precision combinations between 8 and 48 bits.

Keywords: computer arithmetic, elementary function approximation, gate arrays, minimax approximation and algorithms, optimization
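
The abstract above splits function evaluation into range reduction followed by evaluation on a small interval such as [0, pi/2) for sin(x). The sketch below illustrates that split in floating point, assuming a degree-11 Taylor polynomial as a stand-in for the core evaluator. The paper's units are fixed-point hardware and use other approximation methods, and the names `sin_core` and `sin_reduced` are hypothetical.

```python
import math

HALF_PI = math.pi / 2


def sin_core(r):
    """Degree-11 Taylor polynomial for sin(r), intended for r in [0, pi/2]."""
    r2 = r * r
    return r * (1 - r2 / 6 * (1 - r2 / 20 * (1 - r2 / 42 * (
        1 - r2 / 72 * (1 - r2 / 110)))))


def sin_reduced(x):
    """Evaluate sin(x) for any finite x via range reduction to one quadrant."""
    t = x % (2 * math.pi)                 # reduce to one period
    quadrant = min(int(t // HALF_PI), 3)  # clamp guards rounding at t == 2*pi
    r = t - quadrant * HALF_PI            # offset inside the quadrant
    if quadrant == 0:
        return sin_core(r)                # sin(r)
    if quadrant == 1:
        return sin_core(HALF_PI - r)      # sin(pi/2 + r) = cos(r)
    if quadrant == 2:
        return -sin_core(r)               # sin(pi + r) = -sin(r)
    return -sin_core(HALF_PI - r)         # sin(3*pi/2 + r) = -cos(r)


# Largest deviation from the library sine over a few test points.
worst = max(abs(sin_reduced(x) - math.sin(x))
            for x in [-10.0, -1.0, 0.0, 0.5, 2.0, 4.0, 5.5, 100.0])
```

Because the core polynomial only has to be accurate on one quadrant, a low-degree approximation suffices, which is the area and speed trade-off the abstract investigates.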
