Search Results - Inner Mongolia University Library

Resilient training of neural network classifiers with approximate computing techniques for hardware-optimised implementations

IET Computers & Digital Techniques, 2019, vol. 13, no. 6, pp. 532-542

Authors: Torres, Vitor Ferreira; Torres, Frank Sill. Affiliations: Univ Fed Minas Gerais, Grad Program Elect Engn, Av Antonio Carlos 6627, BR-31270901 Belo Horizonte, MG, Brazil; DFKI GmbH, Cyber Phys Syst, Bibliothekstr 5, D-28359 Bremen, Germany

As Machine Learning applications increase the demand for optimised implementations in both embedded and high-end processing platforms, the industry and research community have been responding with different approaches to implement these solutions. This work presents approximations to arithmetic operations and mathematical functions that, associated with a customised adaptive artificial neural networks training method, based on RMSProp, provide reliable and efficient implementations of classifiers. The proposed solution does not rely on mixed operations with higher precision or complex rounding methods that are commonly applied. The intention of this work is not to find the optimal simplifications for specific deep learning problems but to present an optimised framework that can be used as reliably as one implemented with precise operations, standard training algorithms and the same network structures and hyper-parameters. By simplifying the 'half-precision' floating point format and approximating exponentiation and square root operations, the authors' work drastically reduces the field programmable gate array implementation complexity (e.g. -43 and -57% in two of the component resources). The reciprocal square root approximation is so simple it could be implemented only with combinational logic. In a full software implementation for a mixed-precision platform, only two of the approximations compensate for the processing overhead of precision conversions.
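To illustrate why a cheap reciprocal square root matters for RMSProp-style training, the sketch below pairs a textbook RMSProp update with the well-known bit-level "fast inverse square root" trick on 32-bit floats. This is not the authors' approximation or their simplified half-precision format, which the abstract does not specify; the function names `approx_rsqrt` and `rmsprop_step` and all default hyper-parameters are assumptions made for illustration.

```python
import struct


def approx_rsqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for positive x (illustrative, float32 bit trick).

    A shift and a subtraction on the float32 bit pattern give a rough guess,
    which one Newton-Raphson step then refines.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 -> uint32
    bits = 0x5F3759DF - (bits >> 1)                      # rough initial guess
    y = struct.unpack("<f", struct.pack("<I", bits))[0]  # uint32 -> float32
    return y * (1.5 - 0.5 * x * y * y)                   # one refinement step


def rmsprop_step(w, g, v, lr=0.01, decay=0.9, eps=1e-8):
    """One scalar RMSProp update; hyper-parameter defaults are assumptions."""
    v = decay * v + (1.0 - decay) * g * g    # running mean of squared gradients
    w = w - lr * g * approx_rsqrt(v + eps)   # multiply by 1/sqrt instead of dividing
    return w, v


w, v = rmsprop_step(w=1.0, g=2.0, v=0.0)
print(w, v)
```

The update needs only a multiplication by the approximate reciprocal square root, so no divider or exact square root unit is required in this formulation.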

Keywords: field programmable gate arrays; learning (artificial intelligence); floating point arithmetic; neural nets; resilient training; neural network classifiers; approximate computing techniques; hardware-optimised implementations; high-end processing platforms; research community; arithmetic operations; mathematical functions; customised adaptive artificial neural networks training method; mixed operations; complex rounding methods; optimal simplifications; specific deep learning problems; optimised framework; precise operations; standard training algorithms; network structures; half-precision floating point; square root operations; field programmable gate array implementation complexity; reciprocal square root approximation; software implementation; mixed-precision platform; precision conversions; machine learning applications

Approximate Successive Cancellation Decoder for Polar Codes

Defence Science Journal, 2025, vol. 75, no. 2, pp. 206-214

Authors: Nandini, Jali; Pullakandam, Muralidhar; Patri, Sreehari Rao. Affiliation: Natl Inst Technol, Warangal 506004, India

Polar codes are forward error correcting (FEC) codes renowned for achieving channel capacity for various codeword lengths. A low-complexity decoder, termed a Successive Cancellation (SC) decoder, is commonly employed to decode polar codes. However, the SC decoder's sequential nature leads to a drawback in terms of decoding speed. This paper proposes an approximate successive cancellation decoder (ASCD), which incorporates approximate computing techniques that are equivalent alternatives to the exact computational units. The comparator and adder-subtractor blocks are replaced by approximate units in the merged processing unit, and an approximate two-bit processing unit is designed at the last stage of the decoder to reduce the hardware complexity and delay with negligible performance degradation. The overall design of the proposed ASCD is implemented targeting the Xilinx Virtex-6 FPGA platform. With the proposed approximate counterparts, the ASCD achieves an average throughput improvement of 68% compared to the former decoders. In addition, the usage of overall hardware resources is reduced by 41%, reducing the processing complexity. The proposed decoder proves beneficial for error-resilient applications in 5G wireless communications.

Keywords: forward error correcting codes; successive cancellation decoder; approximate successive cancellation decoder; merged processing unit; approximate computing techniques

A Methodology for Fault-tolerant Pareto-optimal Approximate Designs of FPGA-based Accelerators

ACM Transactions on Embedded Computing Systems, 2023, vol. 22, no. 4, pp. 1-31

Authors: Tsounis, Ioannis; Agiakatsikas, Dimitris; Psarakis, Mihalis. Affiliation: Univ Piraeus, Karaoli & Dimitriou 80, Piraeus 18534, Greece

Approximate computing techniques (ACTs) take advantage of resilience computing applications to trade off among output precision, area, power, and performance. ACTs can lead to significant gains at affordable costs when efficiently implemented on Field Programmable Gate Array (FPGA) based accelerators. Although several novel ACT works have been proposed for FPGA accelerators, their applicability to high-assurance systems has not been explored as much. ACTs are becoming necessary in many critical Edge computing systems, such as self-driving cars and Earth observation satellites, to increase computational efficiency. However, an important question comes to mind when targeting critical systems: Does ACT optimization negatively affect the reliability of the system, and how can one find optimal design architectures that blend classic mitigation techniques like Triple Modular Redundancy with approximation- and precise-based arithmetic hardware units to achieve the best possible computational efficiency without compromising dependability? This work aims to solve this research problem by introducing a Design Space Exploration (DSE) methodology that employs ACTs in arithmetic units of the design and identifies Pareto-optimal microarchitectures that balance all relevant gains of ACTs, such as area, speed, power, failure rate, and precision, by inserting the correct amount of approximation in the design. In a nutshell, our DSE methodology has formulated the DSE as a Multi-Objective Optimization Problem (MOP). Each Pareto-optimal solution of our tool finds which arithmetic units of the design to implement with precise and approximate circuits and which units to selectively triplicate to remove single points of failure that compromise system reliability below acceptable thresholds. We also suggest another formulation of the DSE into a Single-Objective constraint Optimization Problem (ScOP) producing a single optimal point, and that the user may demand, as a less time-consuming

Keywords: approximate computing techniques; approximate adders; FPGAs; Pareto-optimal; fault tolerance; JPEG; DCT; selective-TMR

A Genetic-algorithm-based Approach to the Design of DCT Hardware Accelerators

ACM Journal on Emerging Technologies in Computing Systems, 2022, vol. 18, no. 3, p. 50

Authors: Barbareschi, Mario; Barone, Salvatore; Bosio, Alberto; Han, Jie; Traiola, Marcello. Affiliations: Univ Naples Federico II, Dept Elect Engn & Informat Technol, Naples, Italy; Univ Lyon, CPE Lyon, INSA Lyon, ECL, CNRS, UCBL, INL, UMR5270, Lyon, France; Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB, Canada; Univ Rennes, IRISA, CNRS, Inria, UMR 6074, Rennes, France

As modern applications demand an unprecedented level of computational resources, traditional computing system design paradigms are no longer adequate to guarantee significant performance enhancement at an affordable cost. Approximate computing (AxC) has been introduced as a potential candidate to achieve better computational performances by relaxing non-critical functional system specifications. In this article, we propose a systematic and high-abstraction-level approach allowing the automatic generation of near Pareto-optimal approximate configurations for a Discrete Cosine Transform (DCT) hardware accelerator. We obtain the approximate variants by using approximate operations, having configurable approximation degree, rather than full-precise ones. We use a genetic searching algorithm to find the appropriate tuning of the approximation degree, leading to optimal tradeoffs between accuracy and gains. Finally, to evaluate the actual HW gains, we synthesize non-dominated approximate DCT variants for two different target technologies, namely, Field Programmable Gate Arrays (FPGAs) and Application Specific Integrated Circuits (ASICs). Experimental results show that the proposed approach allows performing a meaningful exploration of the design space to find the best tradeoffs in a reasonable time. Indeed, compared to the state-of-the-art work on approximate DCT, the proposed approach allows an 18% average energy improvement while providing at the same time image quality improvement.
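The "non-dominated" variants mentioned in the abstract can be illustrated with a generic Pareto filter. The sketch below is not the authors' tool or their genetic search; it only shows how non-dominated points are selected once each candidate has been scored. The objective pairs (error, energy), both to be minimised, and the sample values are made up for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in at least one.

    All objectives are assumed to be minimised.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (error, energy) scores for four approximate configurations.
candidates = [(0.10, 5.0), (0.20, 3.0), (0.15, 6.0), (0.30, 3.0)]
front = pareto_front(candidates)
print(front)  # [(0.1, 5.0), (0.2, 3.0)]
```

This quadratic-time filter is adequate for small candidate sets; a search over a large design space would normally maintain the front incrementally.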

Keywords: code mutation; genetic algorithm; approximate computing techniques; design space exploration; JPEG; discrete cosine transform
