Search Results - Inner Mongolia University Library

Performance-energy trade-offs of deep learning convolution algorithms on ARM processors

JOURNAL OF SUPERCOMPUTING, 2023, Vol. 79, No. 9, pp. 9819-9836

Authors: Dolz, Manuel F.; Barrachina, Sergio; Martinez, Hector; Castello, Adrian; Macia, Antonio; Fabregat, German; Tomas, Andres E. Affiliations: Univ Jaume 1, Castellon de La Plana, Spain; Univ Cordoba, Cordoba, Spain; Univ Alacant, Alacant, Spain; Univ Politecn Valencia, Valencia, Spain

In this work, we assess the performance and energy efficiency of high-performance codes for the convolution operator, based on the direct, explicit/implicit lowering and Winograd algorithms used for deep learning (DL) inference on a series of ARM-based processor architectures. Specifically, we evaluate the NVIDIA Denver2 and Carmel processors, as well as the ARM Cortex-A57 and Cortex-A78AE CPUs as part of a recent set of NVIDIA Jetson platforms. The performance-energy evaluation is carried out using the ResNet-50 v1.5 convolutional neural network (CNN) on varying configurations of convolution algorithms, number of threads/cores, and operating frequencies on the tested processor cores. The results demonstrate that the best throughput is obtained on all platforms with the Winograd convolution operator running on all the cores at their highest frequency. However, if the goal is to reduce the energy footprint, there is no rule of thumb for the optimal configuration.

Keywords: convolution algorithms; ARM processors; High performance; Energy efficiency

The Method of a Fast Electrothermal Transient Analysis of Single-Inductance DC-DC Converters

IEEE TRANSACTIONS ON POWER ELECTRONICS, 2012, Vol. 27, No. 9, pp. 4005-4012

Authors: Gorecki, Krzysztof; Zarebski, Janusz. Affiliations: Gdynia Maritime Univ, Dept Marine Elect, PL-81225 Gdynia, Poland

In this paper, a new method for fast estimation of the characteristics of single-inductance dc-dc converters at the steady state with self-heating taken into account is proposed. This method is based on the special memoryless convolution algorithm. The method is described in detail. The theoretical considerations are illustrated with simulation results of buck and boost converters.

Keywords: convolution algorithms; dc-dc converters; electrothermal analysis

Properties of some convolution algorithms for the thermal analysis of semiconductor devices

APPLIED MATHEMATICAL MODELLING, 2007, Vol. 31, No. 8, pp. 1489-1496

Authors: Zarebski, Janusz; Gorecki, Krzysztof. Affiliations: Gdynia Maritime Univ, Dept Marine Elect, PL-81225 Gdynia, Poland

This paper deals with three kinds of convolution algorithms designed for the thermal analysis of semiconductor devices and electronic circuits using lumped thermal models. Models and algorithms of this kind are especially convenient for analysing electronic circuits consisting of a large number of thermally sensitive devices. The fundamental features of these algorithms, such as stability, convergence and accuracy, are considered and investigated in the paper. In the investigations, an exponential test function describing the dissipated power is taken into account. It is analytically proved that the considered algorithms are stable and convergent. Analytical formulas describing the values of the local and total cut off error of the algorithms are proposed. The theoretical considerations are accompanied by some calculation results illustrating the influence of the values of the thermal model parameters and the size of the analysis step on the accuracy of calculations carried out with the considered algorithms. (c) 2006 Elsevier Inc. All rights reserved.

Keywords: convolution algorithms; stability; convergence; thermal analysis

Basic dosimetric verification in water of the anisotropic analytical algorithm for Varian, Elekta and Siemens linacs

ZEITSCHRIFT FUR MEDIZINISCHE PHYSIK, 2008, Vol. 18, No. 2, pp. 128-135

Authors: Cozzi, Luca; Nicolini, Giorgia; Vanetti, Eugenio; Clivio, Alessandro; Glashoerster, Marco; Schiefer, Hans; Fogliata, Antonella. Affiliations: Oncol Inst So Switzerland, Dept Radiat Oncol, Med Phys Unit, CH-6504 Bellinzona, Switzerland; Univ Munster, Dept Radiat Oncol, Munster, Germany; Kantonsspital St Gallen, Klin Radioonkol, St Gallen, Switzerland; Univ Lausanne, Fac Med, Lausanne, Switzerland

Since early 2007 a new version of the Anisotropic Analytical Algorithm (AAA) for photon dose calculations was released by Varian Medical Systems for clinical usage on Elekta linacs and also, with some restrictions, for Siemens linacs. Basic validation studies were performed and reported for three beams: 4, 6 and 15 MV for an Elekta Synergy, 6 and 15 MV for a Siemens Primus and, as a reference, for 6 and 15 MV from a Varian Clinac 2100C/D. Generally AAA calculations reproduced well measured data and small deviations were observed for open and wedged fields. PDD curves showed on average differences between calculation and measurement smaller than 1% or 1.2 mm for Elekta beams, 1% or 1.8 mm for Siemens beams and 1% or 1 mm for Varian beams. Profiles in the flattened region matched measurements with deviations smaller than 1% for Elekta and Varian beams, 2% for Siemens. Percentage differences in Output Factors were observed as small as 1% on average.

Keywords: convolution algorithms; anisotropic analytical algorithm; treatment planning systems

Automatic derivation and implementation of fast convolution algorithms

JOURNAL OF SYMBOLIC COMPUTATION, 2004, Vol. 37, No. 2, pp. 261-293

Authors: Johnson, JR; Breitzman, AF. Affiliations: Drexel Univ, Dept Comp Sci, Philadelphia, PA 19104, USA

This paper surveys algorithms for computing linear and cyclic convolution. Algorithms are presented in a uniform mathematical notation that allows automatic derivation, optimization, and implementation. Using the tensor product and Chinese remainder theorem, a space of algorithms is defined and the task of finding the best algorithm is turned into an optimization problem over this space of algorithms. This formulation led to the discovery of new algorithms with reduced operation count. Symbolic tools are presented for deriving and implementing algorithms. (C) 2003 Elsevier Ltd. All rights reserved.

Keywords: cyclic convolution; convolution algorithms

TELETRAFFIC ENGINEERING FOR PRODUCT-FORM CIRCUIT-SWITCHED NETWORKS

ADVANCES IN APPLIED PROBABILITY, 1990, Vol. 22, No. 3, pp. 657-675

Authors: ROSS, KW; TSANG, D. Affiliations: University of Pennsylvania

We develop a performance modeling methodology for product-form circuit-switched networks. These networks allow for: arbitrary topology and link capacities; Poisson and finite population arrivals; multiple classes of calls, each class with a different route and bandwidth requirement; conference as well as point-to-point calls. The methodology is first applied to generalized tree networks, which consist of multiple access links feeding into a common link. Each access link may support multiple 'long-distance' classes (requiring circuits only on the access link and on the common link) and multiple 'local' classes (requiring circuits only on the access link). For generalized tree networks an efficient algorithm is given to determine the blocking probabilities. The methodology is then applied to hierarchical tree networks, where traffic is repeatedly merged in the direction of a root node. We also establish a 'Norton' theorem for product-form circuit-switched networks. This theorem implies that for any given calling class, the entire network can be replaced by an Erlang loss system with a state-dependent arrival rate, without modifying the equilibrium probabilities for the particular calling class.

Keywords: TELEPHONE NETWORKS; REVERSIBILITY; convolution algorithms; NORTON'S EQUIVALENT

CONVOLUTION ALGORITHMS FOR SMALL-WORD-LENGTH DIGITAL-FILTERING APPLICATIONS

IEE JOURNAL ON ELECTRONIC CIRCUITS AND SYSTEMS, 1979, Vol. 3, No. 6, pp. 253-256

Authors: REDDY, NS; REDDY, VU. Affiliations: Radar and Communication Centre, Indian Institute of Technology, Kharagpur, India

Implementation of rectangular transforms (r.t.) in modular arithmetic and computation of number theoretic transforms through Winograd's algorithm are discussed. The computational effort of various algorithms to implement real convolution is investigated. Considering the signal/noise ratio performance and hardware complexity, it is shown that the r.t.s are best suited for digital-filtering applications with word lengths less than about 16 bits. Finally, r.t.s are shown to be the most amenable to the application of the Chinese remainder theorem for increasing the dynamic range.

Keywords: Chinese remainder theorem; transforms; rectangular transforms; Winograd's algorithm; real convolution; signal/noise ratio performance; Other numerical methods; Digital arithmetic methods; computational requirements; digital arithmetic; digital convolution; convolution algorithms; number theoretic transforms; Digital filters; hardware complexity; small word length digital filtering; modular arithmetic
