An effective data-parallel programming environment will use a variety of tools that support the development of efficient data-parallel programs while insulating the programmer from the intricacies of the explicitly parallel code.
Modern optimizing compilers use several passes over a program's intermediate representation to generate good code. Many of these optimizations exhibit a phase-ordering problem. Getting the best code may require iterating optimizations until a fixed point is reached. Combining these phases can lead to the discovery of more facts about the program, exposing more opportunities for optimization. This article presents a framework for describing optimizations. It shows how to combine two such frameworks and how to reason about the properties of the resulting framework. The structure of the framework provides insight into when a combination yields better results. To make the ideas more concrete, this article presents a framework for combining constant propagation, value numbering, and unreachable-code elimination. It is an open question as to what other frameworks can be combined in this way.
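The interaction the abstract describes can be seen in a toy Python sketch (this is an illustration of the general idea, not the paper's actual framework): when constant propagation discovers that a branch condition folds to a constant, only one successor block is marked reachable, and skipping the unreachable block in turn keeps its assignments from polluting the constant facts. The IR shape and names here are hypothetical, and the sketch deliberately omits the lattice meet/join a real analysis needs when two reachable paths assign different constants.

```python
# Toy combined constant propagation + unreachable-code elimination.
# Each block maps to (assignments, branch), where branch is either None
# or (cond_var, true_target, false_target). All names are illustrative.
blocks = {
    "entry": ([("x", 1)], ("x", "then", "else")),  # x = 1; if x: then else: else
    "then":  ([("y", 2)], None),                   # y = 2
    "else":  ([("y", 3)], None),                   # never reached once x folds to 1
}

def analyze(blocks, entry="entry"):
    consts, reachable, work = {}, set(), [entry]
    while work:
        b = work.pop()
        if b in reachable:
            continue
        reachable.add(b)
        stmts, branch = blocks[b]
        for var, val in stmts:
            consts[var] = val           # record the constant fact
        if branch:
            cond, t, f = branch
            if cond in consts:
                # Condition is a known constant: only one arm is reachable,
                # so the dead arm's assignments never reach `consts`.
                work.append(t if consts[cond] else f)
            else:
                work.extend([t, f])     # unknown condition: both arms live
    return consts, reachable

consts, reachable = analyze(blocks)
```

Run separately, each phase would be weaker: constant propagation alone would see both assignments to y, and dead-code elimination alone could not prove the branch dead. Combined, the analysis concludes y == 2.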
PC hardware doubles in processing power every two years, or with each new generation, at approximately constant price. But software has not kept pace. Sixteen-bit code developed in the 1980s or early '90s may be slowed two to 20 times by I/O bottlenecks like VGA graphics, artificial data dependencies, poor memory use, obsolete compilers and libraries, and a host of other factors. Software can be designed to scale more readily with greater hardware power, but programmers typically do not profile their code unless it runs "too slowly." Modern compilers produce excellent executables, but developers must choose the best settings of compiler switches, profilers, and optimized runtime libraries. They must also understand the intricacies and idiosyncrasies of the target hardware, in this case the Intel 486, Pentium, and Pentium Pro processors, and the new MMX technology. They must also consider what types of algorithms lend themselves to optimization and what code optimization techniques are most effective. We consider each of these issues before describing a profiling tool called VTune.
It is demonstrated that optimization techniques incorporated within a silicon compiler for read-only memories (ROMs) can achieve significant yield, power, and speed improvements by minimizing the number of transistors, drains, and metal interconnections in the ROM. Transistor minimization adopts a heuristic solution to the NP-complete graph partitioning problem with a powerful technique applicable to various ROM design styles and technologies. If diffusion mask personalization is permitted, the design can be further improved by solving the traveling salesman problem to minimize transistor source/drain regions. In table look-up ROMs compiled for 3-µm and 1.2-µm CMOS with diffusion mask programming, the compiler eliminated over 45% of the transistors and drains. Test results show that 3-µm CMOS ROMs have access times between 50 and 70 ns. ROMs with 1.2-µm features achieve simulated access times below 20 ns. A simple interface allows the optimizing compiler to work easily with other CAD tools such as microcode assemblers.
This embedded tool suite lets users make architectural changes to a programmable DSP core on three levels and supports designer-defined instructions and computation units. The entire system is based on configurability through a file-based resource description that drives all the design tools.
Focuses on the classification scheme and retrieval problem of computer software for reusability in Japan. Development of the technique for reusing software components; steps of code reuse; levels of reuse of components.
The author discusses the bottlenecks that impair performance of a computer system and discusses the success of the RISC (reduced-instruction-set computer) approach. He attributes it, at least in part, to the fact that all the seminal work on the RISC chips was carried out in close conjunction with a strong compiler team. He discusses issues that designers of computer systems must consider and examines trends that will affect the optimum design points for future systems. The author then addresses what he refers to as 'soggy software', i.e. the slow pace of progress in software development as compared to hardware, identifying standardization and reuse as necessary components of any solution to the problem.
The Fortran I compiler was the first demonstration that it is possible to automatically generate efficient machine code from high-level languages. It has thus been enormously influential. This article presents a brief description of the techniques used in the Fortran I compiler for the parsing of expressions, loop optimization, and register allocation.
This paper describes a tiling technique that can be used by application programmers and optimizing compilers to obtain I/O-efficient versions of regular scientific loop nests. Due to the particular characteristics of I/O operations, a straightforward extension of the traditional tiling method to I/O-intensive programs may result in poor I/O performance. Therefore, the technique presented in this paper adapts iteration space tiling for I/O-performing loop nests to deliver high I/O performance. The generated code results in huge savings in the number of I/O calls as well as the volume of data transferred between the disk subsystem and main memory. Our experimental results on the IBM SP-2 distributed-memory message-passing multiprocessor demonstrate that the reduction in these two parameters, namely, the number of I/O calls and the transferred data volume, can lead to a marked decrease in overall execution times of I/O-intensive loop nests. In a number of loop nests extracted from several benchmarks and math libraries, we were able to improve the execution times by an average of 42.5% for one data set and by an average of 47.4% for another.
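The core effect the abstract claims, fewer and larger I/O requests after tiling, can be illustrated with a toy Python model (not the paper's algorithm; `read_row_chunk`, the array size N, and the tile size B are all hypothetical stand-ins, and the I/O is simulated by a counter rather than real disk reads):

```python
# Toy model: count simulated I/O calls for element-at-a-time access
# versus B x B tiled access over an N x N row-major array on disk.
N, B = 8, 4
io_calls = 0

def read_row_chunk(i, j0, j1):
    """Stand-in for a real read(): one I/O call per contiguous chunk of row i."""
    global io_calls
    io_calls += 1
    return [0.0] * (j1 - j0)

# Untiled access pattern: one tiny read per element -> N*N calls.
io_calls = 0
for i in range(N):
    for j in range(N):
        read_row_chunk(i, j, j + 1)
naive_calls = io_calls

# Tiled access pattern: one read per row segment of each B x B tile
# -> (N/B) * (N/B) * B calls, each B elements wide.
io_calls = 0
for ii in range(0, N, B):
    for jj in range(0, N, B):
        for i in range(ii, ii + B):
            read_row_chunk(i, jj, jj + B)
tiled_calls = io_calls
```

With N = 8 and B = 4 the untiled loop issues 64 one-element reads while the tiled loop issues 16 four-element reads, moving the same data in far fewer calls, which is the kind of saving the paper measures at much larger scales.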
Basic block reordering is an important step for profile-guided binary optimization. The state-of-the-art goal for basic block reordering is to maximize the number of fall-through branches. However, we demonstrate that such orderings may impose suboptimal performance on instruction and I-TLB caches. We propose a new algorithm that relies on a model combining the effects of fall-through and caching behavior. Because the details of modern processor caching are quite complex and often unknown, we show how to use machine learning in selecting parameters that best trade off different caching effects to maximize binary performance. An extensive evaluation on a variety of applications, including Facebook production workloads, the open-source compilers Clang and GCC, and SPEC CPU benchmarks, indicates that the new method outperforms existing block reordering techniques, improving the resulting performance of applications with large code size. We have open-sourced the code of the new algorithm as part of a post-link binary optimization tool, BOLT.
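The fall-through-maximizing baseline that this abstract improves upon can be sketched as a greedy chain-merging heuristic in Python (a simplified illustration in the spirit of classic profile-guided layout, not BOLT's cache-aware model; the CFG, edge weights, and block names are invented for the example):

```python
# Greedy fall-through chaining: visit profiled branch edges from hottest to
# coldest, and merge the chain ending at the edge's source with the chain
# starting at its target, so the hot edge becomes a fall-through in the layout.
edges = {("A", "B"): 100, ("A", "C"): 10, ("B", "D"): 90, ("C", "D"): 10}
blocks = ["A", "B", "C", "D"]

chains = {b: [b] for b in blocks}   # every block starts in its own chain
head = {b: b for b in blocks}       # block -> head of the chain containing it

for (src, dst), _w in sorted(edges.items(), key=lambda e: -e[1]):
    cs, cd = head[src], head[dst]
    if cs == cd:
        continue                    # already in one chain; merging would cycle
    # Merge only if src ends its chain and dst begins its chain,
    # so dst can literally fall through from src in the final layout.
    if chains[cs][-1] == src and chains[cd][0] == dst:
        chains[cs].extend(chains[cd])
        for b in chains[cd]:
            head[b] = cs
        del chains[cd]

layout = [b for chain in chains.values() for b in chain]
```

On this toy profile the heuristic emits the layout A, B, D, C, realizing the two hottest edges (weights 100 and 90) as fall-throughs. The paper's point is that maximizing this count alone ignores I-cache and I-TLB locality, which its learned, cache-aware model accounts for.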