An FPU design template to optimize the accuracy-efficiency-area trade-off

Authors: Zoni, Davide; Galimberti, Andrea; Fornaciari, William

Affiliation: DEIB, Politecnico di Milano, I-20133 Milan, Italy

Publication: Sustainable Computing: Informatics and Systems

Year, volume, issue: 2021, Volume 29, Part A

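Subject classification: 08 [Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be conferred)]

Funding: Horizon 2020 Framework Programme, H2020 (801137)

Keywords: Floating Point Units (FPU); Accuracy-Cost-energy tradeoff; Run-time optimization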

Abstract: Modern embedded systems are in charge of an increasing number of tasks that extensively employ floating-point (FP) computations. The ever-increasing efficiency requirement, coupled with the additional computational effort needed to perform FP computations, motivates several microarchitectural optimizations of the FPU. This manuscript presents a novel modular FPU microarchitecture, which targets modern embedded systems and considers heterogeneous workloads including both best-effort and accuracy-sensitive applications. The design optimizes the EDP-accuracy-area figure of merit by allowing the precision of each FP operation to be configured independently at design time, while the FP dynamic range is kept common to the entire FPU to deliver a simpler microarchitecture. To ensure the correct execution of accuracy-sensitive applications, a novel compiler pass substitutes each FP operation for which low-precision hardware support is offered with the corresponding soft-float function call. The assessment considers seven FPU variants encompassing three different state-of-the-art designs. The results on several representative use cases show that the binary32 FPU implementation offers an EDP gain of 15%, while, when the FPU implements a mix of binary32 and bfloat16 operations, the EDP gain is 19%, the reduction in resource utilization is 21%, and the average accuracy loss is less than 2.5%. Moreover, the resource utilization of our FPU variants is in line with that of the FPU employing state-of-the-art, highly specialized FP hardware accelerators. Starting from the assessment, a set of guidelines is drawn to steer the design of the FP hardware support in modern embedded systems.
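
Illustrative note: the abstract contrasts binary32 and bfloat16 operations that share one dynamic range. The Python sketch below is not taken from the paper and does not model its hardware; it only illustrates the two formats, assuming the standard bfloat16 layout (the upper 16 bits of a binary32 value: 1 sign bit, 8 exponent bits, 7 fraction bits) and round-to-nearest-even. The function names are hypothetical.

```python
import struct


def to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (assumed round-to-nearest-even).

    x is first rounded to binary32; values outside the binary32 range
    make struct.pack raise OverflowError.
    """
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: keep it a NaN by forcing a fraction bit that survives truncation.
    if (bits32 & 0x7F800000) == 0x7F800000 and (bits32 & 0x007FFFFF):
        return ((bits32 >> 16) | 0x0040) & 0xFFFF
    # Round to nearest, ties to even, on the 16 bits that get dropped.
    rounding_bias = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding_bias) >> 16) & 0xFFFF


def from_bfloat16_bits(b: int) -> float:
    """Widen a bfloat16 pattern back to a Python float (exact conversion)."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def bf16_round(x: float) -> float:
    """Round x to the nearest bfloat16 value and return it as a float."""
    return from_bfloat16_bits(to_bfloat16_bits(x))


def to_binary32(x: float) -> float:
    """Round x to binary32 and return it as a float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


if __name__ == "__main__":
    for value in (1.0, 0.1, 3.14159):
        ref = to_binary32(value)
        approx = bf16_round(value)
        rel_err = abs(approx - ref) / abs(ref)
        print(f"{value}: binary32={ref!r} bfloat16={approx!r} rel_err={rel_err:.2e}")
```

Because the exponent field is unchanged, the conversion only drops fraction bits: the representable range stays the same as binary32, while the relative rounding error for normal values is bounded by 2^-8. The per-application accuracy figures in the abstract come from the paper's own use cases and cannot be derived from this sketch.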
