High-level synthesis (HLS) allows hardware to be produced directly from a behavioral description in C/C++, thus accelerating the design process. Loop pipelining is a key transformation of HLS, as it improves the throughput of the design at the price of a small hardware overhead. However, for small loops, its use often results in poor hardware utilization due to the pipeline latency overhead. Overlapping the iterations of the whole loop nest, instead of only those of the innermost loop, is a way to overcome this difficulty, but currently available techniques are restricted to perfectly nested loops with constant bounds, involving only uniform dependences. Using the polyhedral model, we extend the applicability of the nested-loop-pipelining transformation by proposing a new legality check and a new loop correction technique, called polyhedral bubble insertion. This method was implemented in a source-to-source compiler targeting HLS, and results on benchmark kernels show that polyhedral bubble insertion is effective in practice on a much larger class of loop nests.
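To make the latency-overhead problem concrete, below is a minimal HLS-style C++ sketch; it assumes Vivado/Vitis HLS pragma conventions and a placeholder loop body, and is not code from the paper. Pipelining only a short innermost loop pays the pipeline fill/drain latency on every outer iteration, while pipelining at the nest level amortizes that cost; polyhedral bubble insertion extends this nest-level overlapping to loops that pragma-based approaches cannot handle (imperfect nests, parametric bounds, non-uniform dependences).

// Hypothetical HLS kernel (not taken from the paper); pragma syntax follows
// Vivado/Vitis HLS conventions.
void scale_inner(const float a[64][4], float b[64][4]) {
  for (int i = 0; i < 64; ++i) {
    for (int j = 0; j < 4; ++j) {
#pragma HLS PIPELINE II=1
      // With a trip count of only 4, the pipeline fill/drain latency is paid
      // on every outer iteration and dominates the schedule.
      b[i][j] = 2.0f * a[i][j];
    }
  }
}

void scale_nest(const float a[64][4], float b[64][4]) {
  for (int i = 0; i < 64; ++i) {
#pragma HLS PIPELINE II=1
    // Pipelining the outer loop overlaps iterations of the whole nest, so the
    // latency is amortized over 64*4 iterations instead of 4.
    for (int j = 0; j < 4; ++j) {
      b[i][j] = 2.0f * a[i][j];
    }
  }
}

The legality check proposed in the paper determines when such whole-nest overlapping respects all dependences; where it does not, the correction step inserts "bubbles" so that the pipelined schedule stays correct.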
Performance and scalability optimization of large HPC applications is currently a labor-intensive, manual process with very low productivity. Major difficulties come from the disaggregated environment for HPC application development: the compiler is only involved in local decisions (core or multithreaded domain), while a library-based, communication-oriented programming model realizes whole-machine parallelism. Realizing any major global change in such a disaggregated environment is very difficult and involves changing large portions of the source code. We present semi-automated techniques, based on structural analysis and rewriting, for performing global transformations on an HPC application source code. We present two case studies using the Self-Consistent Field (SCF) standalone benchmark as well as the Coupled Cluster (CCSD) module (2.9 million lines of Fortran code), a key module of the NWChem computational chemistry application. We demonstrate how structural rewriting techniques can be used to automate transformations that affect multiple sections of the application's source code. We show that the transformations can be applied in a systematic fashion across the source code bases with minimal manual effort. These transformations improve the scalability of the SCF benchmark by more than two orders of magnitude and the performance of the full CCSD module by a factor of four.
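As a rough, hypothetical illustration of the "apply one change everywhere" aspect (the rule, file extension, and routine names below are invented; the paper's tooling rewrites program structure rather than raw text), here is a C++17 sketch that walks a source tree and applies a single rewrite rule to every Fortran file it finds:

// Hypothetical sketch: apply one textual rewrite rule across an entire source
// tree. Structural rewriting, as described in the paper, matches program
// structure rather than regexes; this only shows the systematic application.
#include <filesystem>
#include <fstream>
#include <iostream>
#include <regex>
#include <sstream>
#include <string>

namespace fs = std::filesystem;

int main(int argc, char** argv) {
  if (argc < 2) { std::cerr << "usage: rewrite <src-root>\n"; return 1; }
  // Placeholder rule: rename a hypothetical routine old_gather -> new_gather.
  const std::regex rule(R"(\bold_gather\b)");
  for (const auto& entry : fs::recursive_directory_iterator(argv[1])) {
    if (!entry.is_regular_file() || entry.path().extension() != ".F") continue;
    std::ifstream in(entry.path());
    std::stringstream buf;
    buf << in.rdbuf();
    in.close();
    std::ofstream out(entry.path());
    out << std::regex_replace(buf.str(), rule, "new_gather");
  }
  return 0;
}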
ISBN (print): 9783642152764
Large and complex systems of ordinary differential equations (ODEs) arise in diverse areas of science and engineering, and pose special challenges on a streaming processor owing to the large amount of state they manipulate. We describe a set of domain-specific source transformations on CUDA C that improved performance by 6.7x on a system of ODEs arising in cardiac electrophysiology running on the nVidia GTX-295, without requiring expert knowledge of the GPU. Our transformations should apply to a wide range of reaction-diffusion systems.
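As one hedged example of the kind of transformation that matters for large ODE state on a streaming processor (the cell model and the choice of transformation are assumptions for illustration, not the paper's actual set), converting the per-cell state from an array-of-structs to a struct-of-arrays layout makes consecutive cells touch consecutive addresses, the access pattern GPUs coalesce well:

#include <cstddef>
#include <vector>

// Hypothetical two-variable cell model; real cardiac models carry tens of
// state variables per cell, which is why layout and state volume matter.
struct CellAoS { double v, w; };  // array-of-structs layout, shown for contrast only

struct StateSoA {                 // struct-of-arrays: each variable is
  std::vector<double> v, w;       // contiguous across all cells
  explicit StateSoA(std::size_t n) : v(n), w(n) {}
};

// Explicit-Euler step over the SoA layout with placeholder kinetics.
void step(StateSoA& s, double dt) {
  for (std::size_t i = 0; i < s.v.size(); ++i) {
    const double dv = s.w[i] - s.v[i] * s.v[i] * s.v[i];
    const double dw = -0.1 * s.w[i] + 0.05 * s.v[i];
    s.v[i] += dt * dv;
    s.w[i] += dt * dw;
  }
}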
ISBN (print): 1595936327
We build on prior work on intra-array memory reuse, for which a general theoretical framework was proposed based on lattice theory. Intra-array memory reuse is a way of reducing the size of a temporary array by folding, thanks to affine mappings and modulo operations, reusing memory locations when they contain a value not used later. We describe the algorithms needed to implement such a strategy. Our implementation has two parts. The first part, Bee, uses the source-to-source transformer ROSE to extract from the program all necessary information on the lifetime of array elements and to generate the code after memory reduction. The second part, Cl@k, is a stand-alone mathematical tool dedicated to optimizations on polyhedra, in particular the computation of successive minima and the computation of good admissible lattices, which are the basis for lattice-based memory reuse. Both tools are developed in C++ and use linear programming and polyhedra manipulations. They can be used either for embedded program optimizations, e.g., to limit memory expansion introduced for parallelization, or in high-level synthesis, e.g., to design memories between communicating hardware accelerators.
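A minimal hand-written example of the folding the abstract describes (the modulo mapping here is chosen by hand; Bee and Cl@k derive such mappings automatically and handle the general lattice case): when each tmp[i] is last read at iteration i+1, two memory cells suffice for the whole temporary array.

#include <cstddef>
#include <vector>

// Before folding: a full temporary array of size n.
std::vector<double> before(const std::vector<double>& a) {
  const std::size_t n = a.size();
  std::vector<double> tmp(n), out(n);
  for (std::size_t i = 0; i < n; ++i) tmp[i] = 2.0 * a[i];               // produce
  for (std::size_t i = 0; i + 1 < n; ++i) out[i] = tmp[i] + tmp[i + 1];  // consume
  return out;
}

// After folding with the mapping tmp[i] -> tmp[i % 2]: tmp[i] is dead once
// iteration i+1 has read it, so two cells are reused for the whole array.
std::vector<double> after(const std::vector<double>& a) {
  const std::size_t n = a.size();
  std::vector<double> out(n);
  double tmp[2];
  for (std::size_t i = 0; i < n; ++i) {
    tmp[i % 2] = 2.0 * a[i];                                  // produce into folded cell
    if (i >= 1) out[i - 1] = tmp[(i - 1) % 2] + tmp[i % 2];   // consume both live values
  }
  return out;
}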
ISBN (print): 9780769543475
CheckPointer is a memory access validator for checking spatial and temporal pointer usage errors in multi-threaded applications by tracking metadata and validating pointer dereferences at run time. The tool uses source-to-source transformations implemented with DMS to instrument the source code of the application to be validated with metadata checks. Libraries available only in binary form are handled by using function wrappers that check metadata immediately before calling a library function and update metadata as necessary immediately after the library function returns.
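A rough sketch of the wrapper idea for a binary-only library routine (the metadata table and helper names below are invented for illustration and are not CheckPointer's actual instrumentation, which is generated with DMS): before delegating to the real memcpy, the wrapper checks that both pointers lie inside tracked allocations large enough for the copy.

#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <map>

// Hypothetical metadata table: allocation base address -> allocation size.
// A real checker tracks much more (lifetimes, per-pointer bounds, threads).
static std::map<const void*, std::size_t> g_allocs;

void* tracked_malloc(std::size_t n) {  // record metadata at allocation time
  void* p = std::malloc(n);
  if (p) g_allocs[p] = n;
  return p;
}

static bool in_bounds(const void* p, std::size_t n) {
  auto it = g_allocs.upper_bound(p);
  if (it == g_allocs.begin()) return false;   // no allocation at or below p
  --it;                                       // nearest allocation base <= p
  const char* base = static_cast<const char*>(it->first);
  const char* ptr  = static_cast<const char*>(p);
  return ptr + n <= base + it->second;        // access stays inside the block
}

// Wrapper called in place of a direct call to the binary-only memcpy.
void* checked_memcpy(void* dst, const void* src, std::size_t n) {
  if (!in_bounds(dst, n) || !in_bounds(src, n)) {
    std::fprintf(stderr, "memcpy: out-of-bounds access of %zu bytes\n", n);
    std::abort();
  }
  return std::memcpy(dst, src, n);
}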
We describe a compilation system for the concurrent programming language Program Composition Notation (PCN). This notation provides a single-assignment programming model that permits concurrent-programming concerns such as decomposition, communication, synchronization, mapping, granularity, and load balancing to be addressed separately in a design. PCN is also extensible with programmer-defined operators, allowing common abstractions to be encapsulated and reused in different contexts. The compilation system incorporates a concurrent-transformation system that allows abstractions to be defined through concurrent source-to-source transformations; these convert programmer-defined operators into a core notation. Run-time system techniques allow the core notation to be compiled into a simple concurrent abstract machine which can be implemented in a portable fashion using a run-time library. The abstract machine provides a uniform treatment of single-assignment and mutable data structures, allowing data sharing between concurrent and sequential program segments and permitting integration of sequential C and Fortran code into concurrent programs. This compilation system forms part of a program development toolkit that operates on a wide variety of networked workstations, multicomputers, and shared-memory multiprocessors. The toolkit has been used both to develop substantial applications and to teach introductory concurrent-programming classes, including a freshman course at Caltech.
Describes a technique for associating rewrite rules with grammar productions so that many high-level transformations of a source file can be generated easily. The approach is shown to be powerful enough to deal with a wide class of problems arising from practical applications, and a notation, with associated language-processing tools, is constructed for annotating a parse tree with transformation rules.
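A toy illustration of attaching a rewrite rule to a production (the tree representation and the rule are invented here, not the paper's notation): an "add" node whose right operand is the literal 0 rewrites to its left operand, and the rule fires wherever the production matched during a bottom-up walk.

#include <iostream>
#include <memory>
#include <string>

// Tiny expression tree: leaves are variables or integer literals,
// interior nodes come from the "add" production.
struct Node {
  std::string op;                 // "var", "int", or "add"
  std::string name;               // for "var"
  int value = 0;                  // for "int"
  std::shared_ptr<Node> lhs, rhs; // for "add"
};

using P = std::shared_ptr<Node>;

P var_(std::string n) { auto p = std::make_shared<Node>(); p->op = "var"; p->name = std::move(n); return p; }
P int_(int v)         { auto p = std::make_shared<Node>(); p->op = "int"; p->value = v; return p; }
P add_(P a, P b)      { auto p = std::make_shared<Node>(); p->op = "add"; p->lhs = a; p->rhs = b; return p; }

// Rewrite rule attached to the "add" production:  E + 0  =>  E.
// Applied bottom-up so nested occurrences are simplified first.
P rewrite(P n) {
  if (n->op == "add") {
    n->lhs = rewrite(n->lhs);
    n->rhs = rewrite(n->rhs);
    if (n->rhs->op == "int" && n->rhs->value == 0) return n->lhs;
  }
  return n;
}

void print(const P& n) {
  if (n->op == "var") std::cout << n->name;
  else if (n->op == "int") std::cout << n->value;
  else { std::cout << "("; print(n->lhs); std::cout << " + "; print(n->rhs); std::cout << ")"; }
}

int main() {
  P e = add_(add_(var_("x"), int_(0)), var_("y"));  // (x + 0) + y
  print(rewrite(e));                                // prints: (x + y)
  std::cout << "\n";
}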