检索结果-内蒙古大学图书馆

Proceedings of the 15th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies

作者： Mika Bröker Johannes Menzel Christian Plessl Paderborn Center for Parallel Computing Paderborn University Paderborn Germany

来源：评论

学校读者我要写书评

暂无评论

FINN-HPC: Closing the Gap for Energy-Efficient Neural Network Inference on FPGAs in HPC 25

FINN-HPC: Closing the Gap for Energy-Efficient Neural Networ...

引用

Proceedings of the 15th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies

作者： Linus Jungemann Bjarne Wintermann Heinrich Riebler Christian Plessl Paderborn University Paderborn Center for Parallel Computing Paderborn Germany

来源：评论

学校读者我要写书评

暂无评论

Neural Network Inference in High-Performance computing: Closing the Gap for FINN based Reconfigurable Accelerators 25

Neural Network Inference in High-Performance Computing: Clos...

引用

Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays

作者： Linus Jungemann Bjarne Wintermann Heinrich Riebler Christian Plessl Paderborn University Paderborn Center for Parallel Computing Paderborn Germany

ISBN: (纸本)9798400713965

In recent years, Neural Networks (NNs) have become one of the most prevailing topics in computers science, both in research and in industry. NNs are used for data analysis, natural language processing, autonomous driving and more. As such, NNs also see more application and use in High-Performance computing (HPC). At the same time, energy efficiency has become an increasingly critical topic. NNs use large amounts of energy for operation, which in return results in large amounts of CO2 emissions. This work presents a comprehensive evaluation of current NN inference soft- and hardware configurations within High-Performance computing (HPC) environments, with a focus on both performance metrics and energy consumption. NN quantization and accelerators such as FPGAs allow for an increased inference efficiency, both in terms of throughput and energy. Therefore, this work focuses on FINN, an efficient NN inference framework for FPGAs, highlighting its current lack of support for HPC systems. We provide an in-depth analysis of FINN in order to implement extensions to optimize the end-to-end execution for the usage in the HPC environment. We thoroughly evaluate the performance and energy efficiency gains using newly implemented optimizations and compare it against existing NN accelerators for HPC. With our extensions of FINN, we were able to achieve a 1847× higher throughput, while also decreasing the latency on average by 0.9978× and EDP by 0.9979× on an Alveo U55C FPGA. Data flow based NN inference accelerators on an FPGA should be used if the performance and energy footprint of the inference process is crucial, and the batch sizes are small to medium. For extremely large batch sizes and a very limited time for network-to-accelerator (less than a few days), using GPUs is still the way to go. Our results show that with the newly developed driver, we outperform a high-end Nvidia A100 GPU by up to 7.81x in throughput, while having a 0.87x lower latency and 0.88x lower energy de

关键词： dataflow architectures

来源：评论

学校读者我要写书评

暂无评论

MARS - A framework for minimizing the job execution time in a metacomputing environment

引用

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF GRID computing AND ESCIENCE 1996年第1期12卷 87-99页

作者： Gehring, J Reinefeld, A Paderborn Center for Parallel Computing Fürstenallee 11 D-33095 Paderborn Germany

Utilizing a collection of workstations and supercomputers in a metacomputing environment does not only offer an enormous amount of computing power, but also raises new problems. The true potential of WAN-based distributed computing can only be exploited if the application-to-architecture mapping reflects the different processor speeds, network performances and the application's communication characteristics. In this paper, we present the Metacomputer Adaptive Runtime System (MARS), a framework for minimizing the execution time of distributed applications on a WAN metacomputer. Work-load balancing and task migration is based on dynamic information on the processor load and network performance. Moreover, MARS uses accumulated statistical data on previous execution runs of the same application to derive an improved task-to-process mapping. Migration decisions are based on: (1) the current system load;(2) the network load;and (3) previously obtained application-specific characteristics. Our current implementation supports C applications with MPI message passing calls, but the general framework is also applicable to other programing environments like PVM, PARMACS and Express.

关键词： metacomputing dynamic load balancing job migration networked computing platforms message passing portability MPI

来源：评论

学校读者我要写书评

暂无评论

BUILDING A VIRTUAL MACHINE-ROOM - A FOCAL POINT IN METAcomputing

引用

FUTURE GENERATION COMPUTER SYSTEMS 1995年第4-5期11卷 477-489页

作者： RAMME, F Paderbom Center for Parallel Computing University of Paderborn Warburger Str. 100 33095 Paderbom Germany

computing resources which are transparently available to the user via networked environments are commonly called a metacomputer. In this sense, a metacomputer is a network of heterogeneous, computational resources linked by software in such a way that they can be used as easily as a single computational unit. During the last few years our work has been concentrated on developing methods and tools to provide a transparent and vendor independent hardware management system to the users. Solving this problem up to a high abstraction level will bring the idea of metacomputing a large step closer to its fruition. After reviewing the metacomputing approaches in Europe and the States, we will break down the task force into almost independent units. One of these, the resource access and allocation problem, the project computing center Software was focused on. This paper takes a closer look at CCS. Its underlying model which uses abstract views for specifying system components and the general purpose Resource Description Language will be sketched. We will explain how it is possible to support Wide-Area Network access and unstable connection lines. Afterwards, we will present the system and vendor independent batch processing facility usable for arbitrary programming environments. On-going activities and an enhancement of the CCS methodology to solve a core problem in wide-area metacomputing will conclude this paper.

关键词： DISTRIBUTED parallel computing RESOURCE AND ACCESS MANAGEMENT METAcomputing HETEROGENEOUS SYSTEM INTEGRATION

来源：评论

学校读者我要写书评

暂无评论

The GOmputer: Accelerating GO with FPGAs

The GOmputer: Accelerating GO with FPGAs

引用

2008 International Conference on Engineering of Reconfigurable Systems and Algorithms, ERSA 2008

作者： Platzner, Marco Döhre, Sven Happe, Markus Kenter, Tobias Lorenz, Ulf Schumacher, Tobias Send, Andre Warkentin, Alexander Paderborn Center for Parallel Computing University of Paderborn Germany

ISBN: (纸本)1601320647

GO is a very popular board game, especially in the Asian world. In contrast to chess programs that are able to compete with human top players, GO programs are still rather weak. Game theory classifies GO and chess as deterministic two-person zero-sum games with perfect information, which allows to address them with game tree search techniques such as the α/β algorithm. In principle, these games can be solved exactly. Practically, the high number of possible moves and the depth of the search tree prohibit exact solutions and require us to resort to a partial analysis of the search tree, leading to runtime consuming heuristic position evaluations. This paper presents the GOmputer project which aims at accelerating GO through aggressively parallelized game tree search combined with FPGA-based position evaluation. We first briefly discuss the algorithmic approach for playing GO, and then focus on FPGA accelerators for several position evaluation functions. The game board is mapped as a cellular automa on directly into hardware;position evaluation functions are turned into cellular algorithms. We show the hardware implementation of several functions and report on the achieved speedups. Finally, we discuss the current state of the GOmputer project.

关键词： Field programmable gate arrays (FPGA)

来源：评论

学校读者我要写书评

暂无评论

Risk-aware migrations for prepossessing SLAs

Risk-aware migrations for prepossessing SLAs

引用

International Conference on Networking and Services 2006, ICNS'06

作者： Voss, Kerstin Paderborn University Paderborn Center for Parallel Computing Germany

ISBN: (纸本)0769526225

SLAs were developed in order to guarantee the customer's desired Quality of Service. To prepossess SLAs even in the case of system failures, migrating the job to an alternative resource is a well-known fault-tolerance mechanism. In this paper we start to consider migrations in a risk-aware concept. We plan to introduce risk assessment and management technologies into the Grid fabric in order to ensure prepossessing SLAs. The most benefits are seen in a risk-aware scheduling and initiating precautionary fault-tolerance mechanism. This paper focusses on precautionary migrations which should prevent an SLA violation. A motivating scenario presents the variety of required actions in a system with high workload for several migration alternatives. The important aspects of jobs and resources are explained. Furthermore, we present a measurement to estimate the effects of migrating to an alternative resource. This will be one decision criteria in the migration process. Future work will complete the risk-aware scheduling of migrations. © 2006 IEEE.

关键词： Grid computing

来源：评论

学校读者我要写书评

暂无评论

Enhance self-managing grids by risk management

Enhance self-managing grids by risk management

引用

3rd International Conference on Networking and Services, ICNS 2007

作者： Voss, Kerstin Paderborn Center for Parallel Computing Paderborn University Germany

Risk management (RM) processes are used in various application fields since often possible threats should be identified, evaluated, and avoided. In the Grid resource failures are common and likely threats which slow down the establishment of Service Level Agreements (SLAs). Introducing RM into the self-managing Grid is beneficial to estimate and react on such threats. The assessed probabilities for resource failures can be used as a decisive factor in different scenarios. In particular, the resource allocation profits from risk information since jobs can be mapped under consideration of their importance and the stability of available resources. Each RM process is particularly developed for one application field since the threats, the consequences, and the retaliatory actions are individual. The difference between conventional RM processes and the Grid integrated one is that the Grid should be self-managing. This implies that after the configuration RM processes have to be performed totally automatical. This paper presents the Grid RM process which bases on the standard for RM processes developed from the Federation of European Risk Management Associations (FERMA). © 2007 IEEE.

关键词： Risk management

来源：评论

学校读者我要写书评

暂无评论

Scalable P2P based RDF querying 06

Scalable P2P based RDF querying

引用

1st International Conference on Scalable Information Systems, InfoScale '06

作者： Heine, Felix Paderborn Center for Parallel Computing Paderborn University Germany

ISBN: (纸本)1595934286

In large-scale distributed systems, information is typically generated decentralized. However, for many applications it is desirable to have a unified view on this knowledge, allowing to query it without regarding the heterogeneity of the underlying systems. In this context, two main requirements have to be fulfilled. On the one hand, we need a flexible knowledge representation, and on the other hand the underlying infrastructure and query evaluation algorithm has to be highly scalable. The combination of p2p networks as basic infrastructure with RDF as a knowledge representation is a promising approach to this problem. Within this paper, we focus on the evaluation of RDF queries with respect to RDF data stored in a DHT-based p2p network. We propose a query algorithm and research different optimizations based on a look-ahead technique and Bloom filters which aim at maximizing the throughput and scalability of the entire system. © 2006 ACM.

关键词： Distributed computer systems

来源：评论

学校读者我要写书评

暂无评论

Exploiting regularities for migration decisions in a metacomputer environment

Exploiting regularities for migration decisions in a metacom...

引用

International Conference and Exhibition on High-Performance computing and Networking, HPCN EUROPE 1996

作者： Gehring, Jörn Reinefeld, Alexander Paderborn Center for Parallel Computing Germany

ISBN: (纸本)9783540611424

We present a scheme that derives task migration decisions for a WAN-metacomputer environment based on previously acquired information on a program's runtime behavior and the current network and computing load. Our basic idea is to utilize information gathered in previous execution runs for improving future migration decisions. © Springer-Verlag Berlin Heidelberg 1996.

关键词：

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：