Search results - Inner Mongolia University Library

14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Huang, Lei; Eachempati, Deepak; Hervey, Marcus W.; Chapman, Barbara (Univ Houston, Dept Comp Sci, Houston, TX 77004, USA)

ISBN: (print) 9781605583976

The advent of new parallel architectures has increased the need for parallel optimizing compilers to assist developers in creating efficient code. OpenUH is a state-of-the-art optimizing compiler, but it performs only a limited set of optimizations for OpenMP programs because of its conservative assumptions about shared-memory programming. These limitations may prevent some OpenMP applications from being optimized as fully as their sequential counterparts. This paper describes our design and implementation of a parallel data flow framework, consisting of a Parallel Control Flow Graph (PCFG) and a Parallel SSA (PSSA) representation in OpenUH, to model data flow for OpenMP programs. This framework enables the OpenUH compiler to perform all classical scalar optimizations for OpenMP programs, in addition to conducting OpenMP-specific optimizations.

Keywords: Language; Performance; Theory; Compiler Analysis; OpenMP; Parallel SSA

Parallel 'go with the winners' algorithms in distributed memory models

Journal of Parallel and Distributed Computing, 2003, vol. 63, no. 9, pp. 801-814

Authors: Peinado, M; Lengauer, T (Microsoft Corp, Redmond, WA 98052, USA; Max Planck Inst Comp Sci, D-66123 Saarbrucken, Germany)

We parallelize the 'go with the winners' algorithm of Aldous and Vazirani (in: Proceedings of the 35th IEEE Symposium on the Foundations of Computer Science, IEEE Computer Society Press, Silver Spring, MD, 1994, pp. 492-501) and analyze the resulting parallel algorithm in the LogP model (in: Proceedings of the Fourth ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, 1993, pp. 1-12). The main issues in the analysis are load imbalances and communication delays. The result of the analysis is a practical algorithm which, under reasonable assumptions, achieves linear speedup. Finally, we analyze our algorithm for a concrete application: generating models of amorphous chemical structures. (C) 2003 Elsevier Inc. All rights reserved.

Keywords: parallel algorithms; go with the winners; LogP model; second moment method; load balancing; amorphous solids; self-avoiding walks

Parallel skeletons for structured composition

Proceedings of the 5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Darlington, John; Guo, Yi-ke; To, Hing Wing; Yang, Jin (Imperial Coll, London, United Kingdom)

In this paper, we propose a straightforward solution to the problems of compositional parallel programming by using skeletons as the uniform mechanism for structured composition. In our approach, parallel programs are constructed by composing procedures in a conventional base language using a set of high-level, predefined, functional, parallel computational forms known as skeletons. The ability to compose skeletons provides us with the essential tools for building further and more complex application-oriented skeletons specifying important aspects of parallel computation. Compared with the process-network-based composition approach, such as PCN, the skeleton approach abstracts away the fine details of connecting communication ports to the higher-level mechanism of making data distributions conform, thus avoiding the complexity of using lower-level ports as the means of interaction. Thus, the framework provides a natural integration of the compositional programming approach with the data parallel programming paradigm.

Keywords: Parallel processing systems
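
As a rough illustration of the skeleton idea described in the abstract above, the following Python sketch composes two basic skeletons into a more application-oriented one. It is not taken from the paper: the names `par_map`, `fold`, `compose`, and `sum_of_squares` are hypothetical, the base language is Python rather than the authors' own, and the fold stage runs sequentially for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def par_map(f):
    """Skeleton (hypothetical name): apply f to every element using a thread pool."""
    def run(xs):
        with ThreadPoolExecutor() as pool:
            return list(pool.map(f, xs))  # pool.map preserves input order
    return run


def fold(op, init):
    """Skeleton (hypothetical name): combine all elements with a binary operator.

    Sequential here; a real skeleton library could use a parallel tree reduction.
    """
    def run(xs):
        return reduce(op, xs, init)
    return run


def compose(*stages):
    """Chain skeletons left to right, yielding a new skeleton."""
    def run(xs):
        for stage in stages:
            xs = stage(xs)
        return xs
    return run


# A more application-oriented skeleton built from the two basic ones.
sum_of_squares = compose(
    par_map(lambda x: x * x),
    fold(lambda a, b: a + b, 0),
)

if __name__ == "__main__":
    print(sum_of_squares([1, 2, 3, 4]))
```

The caller only names the computational forms and how they are chained; how work is distributed stays hidden inside each skeleton, which is the abstraction the abstract contrasts with port-level composition.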

Pizza into Java: translating theory into practice

The 1997 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL'97

Authors: Odersky, Martin; Wadler, Philip (Univ of Karlsruhe, Karlsruhe, Germany)

Pizza is a strict superset of Java that incorporates three ideas from the academic community: parametric polymorphism, higher-order functions, and algebraic data types. Pizza is defined by translation into Java and compiles into the Java Virtual Machine, requirements which strongly constrain the design space. Nonetheless, Pizza fits smoothly to Java, with only a few rough edges.

Keywords: Object oriented programming

Automatic alignment of array data and processes to reduce communication time on DMPPs (1995)

5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Philippsen, M (ICSI International Computer Science Institute, Berkeley, CA, and Dept. of Informatics, University of Karlsruhe)

ISBN: (print) 9780897917001

This paper investigates the problem of aligning array data and processes in a distributed-memory implementation. We present complete algorithms for compile-time analysis, the necessary program restructuring, and subsequent code generation, and discuss their complexity. Finally, we evaluate their practical usefulness through quantitative experiments. The technique presented analyzes complete programs, including branches, loops, and nested parallelism. Alignment is determined with respect to offset, stride, and general axis relations. Placement of both data and processes is computed in a unifying framework based on an extended preference graph and its analysis. Dynamic redistributions are derived. The experimental results are very encouraging. The optimization algorithms implemented in our Modula-2* compiler improved the execution times of the programs by an average of over 40% on a MasPar MP-1 with 16384 processors.

Keywords: Parallel processing systems

Model and compilation strategy for out-of-core data parallel programs

Proceedings of the 5th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Bordawekar, Rajesh; Choudhary, Alok; Kennedy, Ken; Koelbel, Charles; Paleczny, Michael (Syracuse Univ, Syracuse, United States)

It is widely acknowledged in high-performance computing circles that parallel input/output needs substantial improvement in order to make scalable computers truly usable. We present a data storage model that allows processors independent access to their own data and a corresponding compilation strategy that integrates data-parallel computation with data distribution for out-of-core problems. Our results compare several communication methods and I/O optimizations using two out-of-core problems, Jacobi iteration and LU factorization.

Keywords: Parallel processing systems

Tiles: A new language mechanism for heterogeneous parallelism (2015)

20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015

Authors: Chen, Yifeng; Cui, Xiang; Mei, Hong (HCST Key Lab., School of EECS, Peking University, Beijing 100871, China)

ISBN: (print) 9781450332057

This paper studies the essence of heterogeneity from the perspective of language mechanism design. The proposed mechanism, called tiles, is a program construct that bridges two relative levels of computation: an outer level of source data in larger, slower or more distributed memory and an inner level of data blocks in smaller, faster or more localized memory.

Keywords: Parallel programming

Shared-memory performance profiling (1997)

Proceedings of the 1997 6th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Xu, Zhichen; Larus, James R.; Miller, Barton P. (Univ of Wisconsin, Madison, WI, United States)

ISBN: (print) 9780897919067

This paper describes a new approach to finding performance bottlenecks in shared-memory parallel programs and its embodiment in the Paradyn Parallel Performance Tools running with the Blizzard fine-grain distributed shared memory system. This approach exploits the underlying system's cache coherence protocol to detect data sharing patterns that indicate potential performance bottlenecks and presents performance measurements in a data-centric manner. As a demonstration, Paradyn helped us improve the performance of a new shared-memory application program by a factor of four.

Keywords: Parallel processing systems

Programming with Hardware Lock Elision (2013)

18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Afek, Yehuda; Levy, Amir; Morrison, Adam (Tel Aviv Univ, Blavatnik Sch Comp Sci, IL-69978 Tel Aviv, Israel)

We present a simple yet effective technique for improving performance of lock-based code using the hardware lock elision (HLE) feature in Intel's upcoming Haswell processor. We also describe how to extend Haswell...

ISBN: (print) 9781450319225

Keywords: Haswell; hardware lock elision; speculative execution

Reducing Contention Through Priority Updates (2013)

18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Shun, Julian; Blelloch, Guy E.; Fineman, Jeremy T.; Gibbons, Phillip B. (Carnegie Mellon Univ, Pittsburgh, PA 15213, USA; Georgetown Univ, Washington, DC 20057, USA; Intel Labs, Pittsburgh, PA, USA)

No abstract available.

ISBN: (print) 9781450319225

Keywords: Experimentation; Performance; Parallel programming; Contention
