Search results - Inner Mongolia University Library

conference on programming language design and implementation

Authors: Morita, Kazutaka; Morihata, Akimasa; Matsuzaki, Kiminori; Hu, Zhenjiang; Takeichi, Masato. Affiliation: Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo, Japan

ISBN: (print) 9781595936332

Divide-and-conquer algorithms are suitable for modern parallel machines, tending to have large amounts of inherent parallelism and working well with caches and deep memory hierarchies. Among others, list homomorphisms are a class of recursive functions on lists, which match very well with the divide-and-conquer paradigm. However, direct programming with list homomorphisms is a challenge for many programmers. In this paper, we propose and implement a novel system that can automatically derive cost-optimal list homomorphisms from a pair of sequential programs, based on the third homomorphism theorem. Our idea is to reduce extraction of list homomorphisms to derivation of weak right inverses. We show that a weak right inverse always exists and can be automatically generated from a wide class of sequential programs. We demonstrate our system with several nontrivial examples, including the maximum prefix sum problem, the prefix sum computation, the maximum segment sum problem, and the line-of-sight problem. The experimental results show practical efficiency of our automatic parallelization algorithm and good speedups of the generated parallel programs.
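
To make the idea of a list homomorphism concrete, here is a minimal hand-written sketch for the maximum prefix sum problem mentioned in the abstract. It is not output of the authors' system; the function names and the choice to pair the result with the total sum are illustrative assumptions. The point it shows is that the result for a concatenation `x ++ y` can be computed from the results for `x` and `y` alone, which is what permits divide-and-conquer evaluation.

```python
def mps_sequential(xs):
    """Left-to-right reference: largest prefix sum (the empty prefix counts as 0)."""
    best = running = 0
    for x in xs:
        running += x
        best = max(best, running)
    return best


def combine(left, right):
    """Associative operator on (max prefix sum, total sum) pairs."""
    m1, s1 = left
    m2, s2 = right
    # The best prefix either ends inside the left part, or covers the
    # whole left part and then the best prefix of the right part.
    return (max(m1, s1 + m2), s1 + s2)


def mps_homomorphism(xs):
    """Divide-and-conquer evaluation; returns (max prefix sum, total sum)."""
    if not xs:
        return (0, 0)
    if len(xs) == 1:
        return (max(0, xs[0]), xs[0])
    mid = len(xs) // 2
    # The two halves are independent, so they could be evaluated in parallel.
    return combine(mps_homomorphism(xs[:mid]), mps_homomorphism(xs[mid:]))
```

The first component of `mps_homomorphism(xs)` should agree with `mps_sequential(xs)` on any input; the second component is the auxiliary value that makes the combining step possible.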

Keywords: algorithms; design; languages; divide-and-conquer parallelism; inversion; program transformation; third homomorphism theorem

EXOCHI: Architecture and programming environment for a heterogeneous multi-core multithreaded system

ACM SIGPLAN Notices, 2007, vol. 42, no. 6, pp. 156-166

Authors: Wang, Perry H.; Collins, Jamison D.; Chinya, Gautham N.; Jiang, Hong; Tian, Xinmin; Girkar, Milind; Yang, Nick Y.; Lueh, Guei-Yuan; Wang, Hong. Affiliations: Intel Corp, Microprocessor Technol Labs, Microarchitecture Res Lab, Santa Clara, CA 95051, USA; Intel Corp, Chipset Grp, Santa Clara, CA 95051, USA; Intel Corp, Software Solut Corp, Intel Compiler Lab, Santa Clara, CA 95051, USA

Future mainstream microprocessors will likely integrate specialized accelerators, such as GPUs, onto a single die to achieve better performance and power efficiency. However, it remains a keen challenge to program such a heterogeneous multi-core platform, since these specialized accelerators feature ISAs and functionality that are significantly different from the general purpose CPU cores. In this paper, we present EXOCHI: (1) Exoskeleton Sequencer (EXO), an architecture to represent heterogeneous accelerators as ISA-based MIMD architecture resources, and a shared virtual memory heterogeneous multithreaded program execution model that tightly couples specialized accelerator cores with general purpose CPU cores, and (2) C for Heterogeneous Integration (CHI), an integrated C/C++ programming environment that supports accelerator-specific inline assembly and domain-specific languages. The CHI compiler extends the OpenMP pragma for heterogeneous multithreading programming, and produces a single fat binary with code sections corresponding to different instruction sets. The runtime can judiciously spread parallel computation across the heterogeneous cores to optimize performance and power. We have prototyped the EXO architecture on a physical heterogeneous platform consisting of an Intel (R) Core (TM) 2 Duo processor and an 8-core 32-thread Intel (R) Graphics Media Accelerator X3000. In addition, we have implemented the CHI integrated programming environment with the Intel (R) C++ Compiler, runtime toolset, and debugger. On the EXO prototype system, we have enhanced a suite of production-quality media kernels for video and image processing to utilize the accelerator through the CHI programming interface, achieving significant speedup (1.41X to 10.97X) over execution on the IA32 CPU alone.

Keywords: performance; design; languages; heterogeneous multi-cores; GPU; OpenMP

The exo VM system for automatic VM and application reduction 07

conference on programming language design and implementation

Authors: Titzer, Ben L.; Auerbach, Joshua; Bacon, David F.; Palsberg, Jens. Affiliation: Univ Calif Los Angeles, Compilers Grp, Los Angeles, CA 90025, USA

ISBN: (print) 9781595936332

Embedded systems pose unique challenges to Java application developers and virtual machine designers. Chief among these challenges is the memory footprint of both the virtual machine and the applications that run within it. With the rapidly increasing set of features provided by the Java language, virtual machine designers are often forced to build custom implementations that make various tradeoffs between the footprint of the virtual machine and the subset of the Java language and class libraries that are supported. In this paper, we present the Exo VM, a system in which an application is initialized in a fully featured virtual machine, and then the code, data, and virtual machine features necessary to execute it are packaged into a binary image. Key to this process is feature analysis, a technique for computing the reachable code and data of a Java program and its implementation inside the VM simultaneously. The Exo VM reduces the need to develop customized embedded virtual machines by reusing a single VM infrastructure and automatically eliding the implementation of unused Java features on a per-program basis. We present a constraint-based instantiation of the analysis technique, an implementation in IBM's J9 Java VM, experiments evaluating our technique for the EEMBC benchmark suite, and some discussion of the individual costs of some of Java's features. Our evaluation shows that our system can reduce the non-heap memory allocation of the virtual machine by as much as 75%. We discuss VM and language design decisions that our work shows are important in targeting embedded systems, supporting the long-term goal of a common VM infrastructure spanning from motes to large servers.
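
The reachability idea behind feature analysis can be illustrated with a small worklist sketch. This is only an illustration under strong simplifying assumptions, not the constraint-based analysis the paper implements in J9: the dependency graph is supplied by hand as a dictionary, and all node names used with it (methods, VM features) are hypothetical.

```python
from collections import deque


def reachable(entry_points, depends_on):
    """Return every node reachable from the entry points.

    depends_on maps a node (for example a method or a VM feature) to the
    nodes it needs; anything never reached is a candidate for removal.
    """
    seen = set(entry_points)
    work = deque(entry_points)
    while work:
        item = work.popleft()
        for dep in depends_on.get(item, ()):
            if dep not in seen:
                seen.add(dep)
                work.append(dep)
    return seen
```

For instance, with a graph in which `main` needs a formatting routine and the garbage collector, and only an unused method needs a native interface, the native interface never appears in the result and could be left out of the packaged image.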

Keywords: pre-initialization; embedded systems; persistence; dead code elimination; static compilation; static analysis; VM design; VM modularity; feature analysis

Fault-tolerant typed assembly language 07

conference on programming language design and implementation

Authors: Perry, Frances; Mackey, Lester; Reis, George A.; Ligatti, Jay; August, David I.; Walker, David. Affiliations: Princeton Univ, Dept Comp Sci & Elect Engn, Princeton, NJ 08544, USA; Univ S Florida, Dept Comp Sci & Comp Engn, Tampa, FL 33620, USA

ISBN: (print) 9781595936332

A transient hardware fault occurs when an energetic particle strikes a transistor, causing it to change state. Although transient faults do not permanently damage the hardware, they may corrupt computations by altering stored values and signal transfers. In this paper, we propose a new scheme for provably safe and reliable computing in the presence of transient hardware faults. In our scheme, software computations are replicated to provide redundancy while special instructions compare the independently computed results to detect errors before writing critical data. In stark contrast to any previous efforts in this area, we have analyzed our fault tolerance scheme from a formal, theoretical perspective. To be specific, first, we provide an operational semantics for our assembly language, which includes a precise formal definition of our fault model. Second, we develop an assembly-level type system designed to detect reliability problems in compiled code. Third, we provide a formal specification for program fault tolerance under the given fault model and prove that all well-typed programs are indeed fault tolerant. In addition to the formal analysis, we evaluate our detection scheme and show that it only takes 34% longer to execute than the unreliable version.
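
The replicate-and-compare scheme described in the abstract can be sketched at a high level as follows. This is a loose illustration in Python rather than the paper's typed assembly language, and `checked_store` and `FaultDetected` are hypothetical names introduced here: a value is computed twice, and the store to memory happens only if both copies agree.

```python
class FaultDetected(Exception):
    """Raised when the two redundant results disagree."""


def checked_store(memory, address, compute):
    """Run compute twice and write the result only if both runs agree."""
    first = compute()
    second = compute()
    if first != second:
        # A mismatch means one copy was corrupted, so refuse to write.
        raise FaultDetected(f"mismatch at {address!r}: {first!r} != {second!r}")
    memory[address] = first
```

A fault that corrupts one of the two computations is detected before the store, so the critical data in `memory` is left untouched.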

Keywords: languages; reliability; theory; verification; transient hardware faults; soft faults; fault tolerance; type systems; typed assembly language

Enforcing isolation and ordering in STM 07

conference on programming language design and implementation

Authors: Shpeisman, Tatiana; Menon, Vijay; Adl-Tabatabai, Ali-Reza; Balensiefer, Steven; Grossman, Dan; Hudson, Richard L.; Moore, Katherine F.; Saha, Bratin. Affiliations: Intel Corp, Programming Syst Lab, Santa Clara, CA 95054, USA; Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195, USA

ISBN: (print) 9781595936332

Transactional memory provides a new concurrency control mechanism that avoids many of the pitfalls of lock-based synchronization. High-performance software transactional memory (STM) implementations thus far provide weak atomicity: Accessing shared data both inside and outside a transaction can result in unexpected, implementation-dependent behavior. To guarantee isolation and consistent ordering in such a system, programmers are expected to enclose all shared-memory accesses inside transactions. A system that provides strong atomicity guarantees isolation even in the presence of threads that access shared data outside transactions. A strongly-atomic system also orders transactions with conflicting non-transactional memory operations in a consistent manner. In this paper, we discuss some surprising pitfalls of weak atomicity, and we present an STM system that avoids these problems via strong atomicity. We demonstrate how to implement non-transactional data accesses via efficient read and write barriers, and we present compiler optimizations that further reduce the overheads of these barriers. We introduce a dynamic escape analysis that differentiates private and public data at runtime to make barriers cheaper and a static not-accessed-in-transaction analysis that removes many barriers completely. Our results on a set of Java programs show that strong atomicity can be implemented efficiently in a high-performance STM system.

Keywords: algorithms; measurement; performance; design; experimentation; languages; transactional memory; strong atomicity; weak atomicity; isolation; ordering; escape analysis; compiler optimizations; code generation; virtual machines

EXOCHI: Architecture and programming Environment for A Heterogeneous Multi-core Multithreaded System 07

conference on programming language design and implementation

Authors: Wang, Perry H.; Collins, Jamison D.; Chinya, Gautham N.; Jiang, Hong; Tian, Xinmin; Girkar, Milind; Yang, Nick Y.; Lueh, Guei-Yuan; Wang, Hong. Affiliation: Intel Corp, Microprocessor Technol Labs, Microarchitecture Res Lab, Santa Clara, CA 95051, USA

ISBN: (print) 9781595936332

Future mainstream microprocessors will likely integrate specialized accelerators, such as GPUs, onto a single die to achieve better performance and power efficiency. However, it remains a keen challenge to program such a heterogeneous multi-core platform, since these specialized accelerators feature ISAs and functionality that are significantly different from the general purpose CPU cores. In this paper, we present EXOCHI: (1) Exoskeleton Sequencer (EXO), an architecture to represent heterogeneous accelerators as ISA-based MIMD architecture resources, and a shared virtual memory heterogeneous multithreaded program execution model that tightly couples specialized accelerator cores with general purpose CPU cores, and (2) C for Heterogeneous Integration (CHI), an integrated C/C++ programming environment that supports accelerator-specific inline assembly and domain-specific languages. The CHI compiler extends the OpenMP pragma for heterogeneous multithreading programming, and produces a single fat binary with code sections corresponding to different instruction sets. The runtime can judiciously spread parallel computation across the heterogeneous cores to optimize performance and power. We have prototyped the EXO architecture on a physical heterogeneous platform consisting of an Intel (R) Core (TM) 2 Duo processor and an 8-core 32-thread Intel (R) Graphics Media Accelerator X3000. In addition, we have implemented the CHI integrated programming environment with the Intel (R) C++ Compiler, runtime toolset, and debugger. On the EXO prototype system, we have enhanced a suite of production-quality media kernels for video and image processing to utilize the accelerator through the CHI programming interface, achieving significant speedup (1.41x to 10.97x) over execution on the IA32 CPU alone.

Keywords: heterogeneous multi-cores; GPU; OpenMP

Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI): Foreword

Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2005, p. iii

Authors: Hall, Mary; Sarkar, Vivek. Affiliations: USC ISI; IBM Research

No abstract available

Effective static race detection for Java 06

ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2006 - PLAS 2006: 2006 Programming Languages and Analysis for Security Workshop

Authors: Naik, Mayur; Aiken, Alex; Whaley, John. Affiliation: Computer Science Department, Stanford University, United States

ISBN: (print) 1595933743

We present a novel technique for static race detection in Java programs, comprised of a series of stages that employ a combination of static analyses to successively reduce the pairs of memory accesses potentially involved in a race. We have implemented our technique and applied it to a suite of multi-threaded Java programs. Our experiments show that it is precise, scalable, and useful, reporting tens to hundreds of serious and previously unknown concurrency bugs in large, widely-used programs with few false alarms. Copyright © 2006 ACM.

Keywords: Java programming language

The ATOMOσ transactional programming language

PLDI 2006 - 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation

Authors: Carlstrom, Brian D.; McDonald, Austen; Chafi, Hassan; Chung, JaeWoong; Minh, Chi Cao; Kozyrakis, Christos; Olukotun, Kunle. Affiliation: Computer Systems Laboratory, Stanford University

ISBN: (print) 1595933204

Atomos is the first programming language with implicit transactions, strong atomicity, and a scalable multiprocessor implementation. Atomos is derived from Java, but replaces its synchronization and conditional waiting constructs with simpler transactional alternatives. The Atomos watch statement allows programmers to specify fine-grained watch sets used with the Atomos retry conditional waiting statement for efficient transactional conflict-driven wakeup even in transactional memory systems with a limited number of transactional contexts. Atomos supports open-nested transactions, which are necessary for building both scalable application programs and virtual machine implementations. The implementation of the Atomos scheduler demonstrates the use of open nesting within the virtual machine and introduces the concept of transactional memory violation handlers that allow programs to recover from data dependency violations without rolling back. Atomos programming examples are given to demonstrate the usefulness of transactional programming primitives. Atomos and Java are compared through the use of several benchmarks. The results demonstrate both the improvements in parallel programming ease and parallel program performance provided by Atomos. Copyright © 2006 ACM.

Keywords: Computer programming languages

Termination proofs for systems code 06

PLDI 2006 - 2006 ACM SIGPLAN Conference on Programming Language Design and Implementation

Authors: Cook, Byron; Podelski, Andreas; Rybalchenko, Andrey. Affiliations: Microsoft Research; Max-Planck-Institut für Informatik, Germany; Max-Planck-Institut für Informatik, EPFL, Germany

ISBN: (print) 1595933204

Program termination is central to the process of ensuring that systems code can always react. We describe a new program termination prover that performs a path-sensitive and context-sensitive program analysis and provides capacity for large program fragments (i.e., more than 20,000 lines of code) together with support for programming language features such as arbitrarily nested loops, pointers, function pointers, side effects, etc. We also present experimental results on device driver dispatch routines from the Windows operating system. The most distinguishing aspect of our tool is how it shifts the balance between the two tasks of constructing and, respectively, checking the termination argument. Checking becomes the hard step. In this paper we show how we solve the corresponding challenge of checking with binary reachability analysis. Copyright © 2006 ACM.

Keywords: Computer programming
