Over the last five years, graphics cards have become a tempting target for scientific computing, thanks to unrivaled peak performance, often producing a runtime speed-up of x10 to x25 over comparable CPU solutions. Ho...
详细信息
ISBN:
(纸本)9781450312059
Over the last five years, graphics cards have become a tempting target for scientific computing, thanks to unrivaled peak performance, often producing a runtime speed-up of x10 to x25 over comparable CPU solutions. However, this increase can be difficult to achieve, and doing so often requires a fundamental rethink. This is especially problematic in scientific computing, where experts do not want to learn yet another architecture. In this paper we develop a method for automatically parallelising recursive functions of the sort found in scientific papers. Using a static analysis of the function dependencies we identify sets - partitions - of independent elements, which we use to synthesise an efficient GPU implementation using polyhedral code generation techniques. We then augment our language with DSL extensions to support a wider variety of applications, and demonstrate the effectiveness of this with three case studies, showing significant performance improvement over equivalent CPU methods, and similar efficiency to hand-tuned GPU implementations.
A number of programminglanguages use rich type systems to verify security properties of code. Some of these languages are meant for source programming, but programs written in these languages are compiled without exp...
详细信息
ISBN:
(纸本)9781450300193
A number of programminglanguages use rich type systems to verify security properties of code. Some of these languages are meant for source programming, but programs written in these languages are compiled without explicit security proofs, limiting their utility in settings where proofs are necessary, e.g., proof-carrying authorization. Others languages do include explicit proofs, but these are generally lambda calculi not intended for source programming, that must be further compiled to an executable form. A language suitable for source programming backed by a compiler that enables end-to-end verification is missing. In this paper, we present a type-preserving compiler that translates programs written in FINE, a source-level functional language with dependent refinements and affine types, to DCIL, a new extension of the .NET Common Intermediate language. FINE is type checked using an external SMT solver to reduce the proof burden on source programmers. We extract explicit LCF-style proof terms from the solver and carry these proof terms in the compilation to DCIL, thereby removing the solver from the trusted computing base. Explicit proofs enable DCIL to be used in a number of important scenarios, including the verification of mobile code, proof-carrying authorization, and evidence-based auditing. We report on our experience using FINE to build reference monitors for several applications, ranging from a plugin-based email client to a conference management server.
We present an approach to enriching the type system of ML with a restricted form of dependent types, where type index objects are drawn from a constraint domain C, leading to the DML(C) language schema. This allows sp...
详细信息
We present an approach to enriching the type system of ML with a restricted form of dependent types, where type index objects are drawn from a constraint domain C, leading to the DML(C) language schema. This allows specification and inference of significantly more precise type information, facilitating program error detection and compiler optimization. A major complication resulting from introducing dependent types is that pure type inference for the enriched system is no longer possible, but we show that type-checking a sufficiently annotated program in DML(C) can be reduced to constraint satisfaction in the constraint domain C. We exhibit the unobtrusiveness of our approach through practical examples and prove that DML(C) is conservative over ML. The main contribution of the paper lies in our languagedesign, including the formulation of type-checking rules which makes the approach practical. To our knowledge, no previous type system for a general purpose programminglanguage such as ML has combined dependent types with features including datatype declarations, higher-order functions, general recursions, let-polymorphism, mutable references, and exceptions. In addition, we have finished a prototype implementation of DML(C) for an integer constraint domain C, where constraints are linear inequalities (Xi and Pfenning 1998).
This paper describes a method for compiling programs using interprocedural register allocation. A strategy for handling programs built from multiple modules is presented, as well as algorithms for global variable prom...
详细信息
ISBN:
(纸本)9780897913645
This paper describes a method for compiling programs using interprocedural register allocation. A strategy for handling programs built from multiple modules is presented, as well as algorithms for global variable promotion and register spill code motion. These algorithms attempt to address some of the shortcomings of previous interprocedural register allocation strategies. Results are given for an implementation on a single register file RISC-based architecture.
We describe the automatic generation of a complete, realistic compiler from formal specifications of the syntax and semantics of Sol/C, a nontrivial imperative language "sort of like C." The compiler exhibit...
详细信息
Stan is a probabilistic programminglanguage that is popular in the statistics community, with a high-level syntax for expressing probabilistic models. Stan differs by nature from generative probabilistic programming ...
详细信息
ISBN:
(纸本)9781450383912
Stan is a probabilistic programminglanguage that is popular in the statistics community, with a high-level syntax for expressing probabilistic models. Stan differs by nature from generative probabilistic programminglanguages like Church, Anglican, or Pyro. This paper presents a comprehensive compilation scheme to compile any Stan model to a generative language and proves its correctness. We use our compilation scheme to build two new backends for the Stanc3 compiler targeting Pyro and NumPyro. Experimental results show that the NumPyro backend yields a 2.3x speedup compared to Stan in geometric mean over 26 benchmarks. Building on Pyro we extend Stan with support for explicit variational inference guides and deep probabilistic models. That way, users familiar with Stan get access to new features without having to learn a fundamentally new language.
The Bluespec hardware-description language presents a significantly higher-level view than hardware engineers are used to, exposing a simpler concurrency model that promotes formal proof, without compromising on perfo...
详细信息
ISBN:
(纸本)9781450376136
The Bluespec hardware-description language presents a significantly higher-level view than hardware engineers are used to, exposing a simpler concurrency model that promotes formal proof, without compromising on performance of compiled circuits. Unfortunately, the cost model of Bluespec has been unclear, with performance details depending on a mix of user hints and opaque static analysis of potential concurrency conflicts within a design. In this paper we present Koika, a derivative of Bluespec that preserves its desirable properties and yet gives direct control over the scheduling decisions that determine performance. Koika has a novel and deterministic operational semantics that uses dynamic analysis to avoid concurrency anomalies. Our implementation includes Coq definitions of syntax, semantics, key metatheorems, and a verified compiler to circuits. We argue that most of the extra circuitry required for dynamic analysis can be eliminated by compile-time BSV-style static analysis.
FNC-2 is a new attribute grammar processing system aiming at expressive power, efficiency, ease of use and versatility. Its development at INRIA started in 1986, and a first running prototype is available since early ...
详细信息
ISBN:
(纸本)0897913647
FNC-2 is a new attribute grammar processing system aiming at expressive power, efficiency, ease of use and versatility. Its development at INRIA started in 1986, and a first running prototype is available since early 1989. Its most important features are: efficient exhaustive and incremental visit-sequence-based evaluation of strongly (absolutely) non-circular AGs;extensive space optimizations;a specially-designed AG-description language, with provisions for true modularity;portability and versatility of the generated evaluators;complete environment for application development. This paper briefly describes the design and implementation of FNC-2 and its peripherals. Then preliminary experience with the system is reported.
We present mechanisms that enable our compiler-target language, C--, to express four of the best known techniques for implementing exceptions, all within a single, uniform framework. We define the mechanisms precisely...
详细信息
We present mechanisms that enable our compiler-target language, C--, to express four of the best known techniques for implementing exceptions, all within a single, uniform framework. We define the mechanisms precisely, using a formal operational semantics. We also show that exceptions need not require special treatment in the optimizer;by introducing extra dataflow edges, we make standard optimization techniques work even on programs that use exceptions. Our approach clarifies the design space of exception-handling techniques, and it allows a single optimizer to handle a variety of implementation techniques. Our ultimate goal is to allow a source-language compiler the freedom to choose its exception-handling policy, while encapsulating the architecture-dependent mechanisms and their optimization in an implementation of C-- that can be used by compilers for many source languages.
Object-oriented applications may contain data members that can be removed from the application without affecting program behavior. Such `dead' data members may occur due to unused functionality in class libraries,...
详细信息
Object-oriented applications may contain data members that can be removed from the application without affecting program behavior. Such `dead' data members may occur due to unused functionality in class libraries, or due to the programmer losing track of member usage as the application changes over time. We present a simple and efficient algorithm for detecting dead data members in C++ applications. This algorithm has been implemented using a prototype version of the IBM VisualAge C++ compiler, and applied to a number of realistic benchmark programs ranging from 600 to 58,000 lines of code. For the non-trivial benchmarks, we found that up to 27.3% of the data members in the benchmarks are dead (average 12.5%), and that up to 11.6% of the object space of these applications may be occupied by dead data members at run-time (average 4.4%).
暂无评论