The most intuitive memory model for shared-memory multi-threaded programming is sequential consistency (SC), but it disallows the use of many compiler and hardware optimizations thereby impacting performance. Data-rac...
详细信息
ISBN:
(纸本)9781450300193
The most intuitive memory model for shared-memory multi-threaded programming is sequential consistency (SC), but it disallows the use of many compiler and hardware optimizations thereby impacting performance. Data-race-free (DRF) models, such as the proposed C++0x memory model, guarantee SC execution for data-race-free programs. But these models provide no guarantee at all for racy programs, compromising the safety and debuggability of such programs. To address the safety issue, the Java memory model, which is also based on the DRF model, provides a weak semantics for racy executions. However, this semantics is subtle and complex, making it difficult for programmers to reason about their programs and for compiler writers to ensure the correctness of compiler optimizations. We present the DRFx memory model, which is simple for programmers to understand and use while still supporting many common optimizations. We introduce a memory model (MM) exception which can be signaled to halt execution. If a program executes without throwing this exception, then DRFx guarantees that the execution is SC. If a program throws an MM exception during an execution, then DRFx guarantees that the program has a data race. We observe that SC violations can be detected in hardware through a lightweight form of conflict detection. Furthermore, our model safely allows aggressive compiler and hardware optimizations within compiler-designated program regions. We formalize our memory model, prove several properties about this model, describe a compiler and hardware design suitable for DRFx, and evaluate the performance overhead due to our compiler and hardware requirements.
Syntax definitions are pervasive in modern software systems, and serve as the basis for language processing tools like parsers and compilers. Mainstream parser generators pose restrictions on syntax definitions that f...
详细信息
ISBN:
(纸本)9781450302036
Syntax definitions are pervasive in modern software systems, and serve as the basis for language processing tools like parsers and compilers. Mainstream parser generators pose restrictions on syntax definitions that follow from their implementation algorithm. They hamper evolution, maintainability, and compositionality of syntax definitions. The pureness and declarativity of syntax definitions is lost. We analyze how these problems arise for different aspects of syntax definitions, discuss their consequences for language engineers, and show how the pure and declarative nature of syntax definitions can be regained.
We present smooth interpretation, a method to systematically approximate numerical imperative programs by smooth mathematical functions. This approximation facilitates the use of numerical search techniques like gradi...
详细信息
ISBN:
(纸本)9781450300193
We present smooth interpretation, a method to systematically approximate numerical imperative programs by smooth mathematical functions. This approximation facilitates the use of numerical search techniques like gradient descent for program analysis and synthesis. The method extends to programs the notion of Gaussian smoothing, a popular signal-processing technique that filters out noise and discontinuities from a signal by taking its convolution with a Gaussian function. In our setting, Gaussian smoothing executes a program according to a probabilistic semantics;the execution of program P on an input x after Gaussian smoothing can be summarized as follows: (1) Apply a Gaussian perturbation to x-the perturbed input is a random variable following a normal distribution with mean x. (2) Compute and return the expected output of P on this perturbed input. Computing the expectation explicitly would require the execution of P on all possible inputs, but smooth interpretation bypasses this requirement by using a form of symbolic execution to approximate the effect of Gaussian smoothing on P. The result is an efficient but approximate implementation of Gaussian smoothing of programs. Smooth interpretation has the effect of attenuating features of a program that impede numerical searches of its input space-for example, discontinuities resulting from conditional branches are replaced by continuous transitions. We apply smooth interpretation to the problem of synthesizing values of numerical control parameters in embedded control applications. This problem is naturally formulated as one of numerical optimization: the goal is to find parameter values that minimize the error between the resulting program and a programmer-provided behavioral specification. Solving this problem by directly applying numerical optimization techniques is often impractical due to the discontinuities in the error function. By eliminating these discontinuities, smooth interpretation makes it possible to
An ad hoc data format is any nonstandard, semi-structured data format for which robust data processing tools are not easily available. In this paper, we present ANNE, a new kind of markup languagedesigned to help use...
详细信息
ISBN:
(纸本)9781450300193
An ad hoc data format is any nonstandard, semi-structured data format for which robust data processing tools are not easily available. In this paper, we present ANNE, a new kind of markup languagedesigned to help users generate documentation and data processing tools for ad hoc text data. More specifically, given a new ad hoc data source, an ANNE programmer edits the document to add a number of simple annotations, which serve to specify its syntactic structure. Annotations include elements that specify constants, optional data, alternatives, enumerations, sequences, tabular data, and recursive patterns. The ANNE system uses a combination of user annotations and the raw data itself to extract a context-free grammar from the document. This context-free grammar can then be used to parse the data and transform it into an XML parse tree, which may be viewed through a browser for analysis or debugging purposes. In addition, the ANNE system generates a PADS/ML description [19], which may be saved as lasting documentation of the data format or compiled into a host of useful data processing tools. In addition to designing and implementing ANNE, we have devised a semantic theory for the core elements of the language. This semantic theory describes the editing process, which translates a raw, unannotated text document into an annotated document, and the grammar extraction process, which generates a context-free grammar from an annotated document. We also present an alternative characterization of system behavior by drawing upon ideas from the field of relevance logic. This secondary characterization, which we call relevance analysis, specifies a direct relationship between unannotated documents and the context-free grammars that our system can generate from them. Relevance analysis allows us to prove important theorems concerning the expressiveness and utility of our system.
programminglanguage specifications mandate static and dynamic analyses to preclude syntactic and semantic errors. Although individual languages are usually well-specified, composing languages is not, and this poor sp...
详细信息
ISBN:
(纸本)9781450300193
programminglanguage specifications mandate static and dynamic analyses to preclude syntactic and semantic errors. Although individual languages are usually well-specified, composing languages is not, and this poor specification is a source of many errors in multilingual programs. For example, virtually all Java programs compose Java and C using the Java Native Interface (JNI). Since JNI is informally specified, developers have difficulty using it correctly, and current Java compilers and virtual machines (VMs) inconsistently check only a subset of JNI constraints. This paper's most significant contribution is to show how to synthesize dynamic analyses from state machines to detect foreign function interface (FFI) violations. We identify three classes of FFI constraints encoded by eleven state machines that capture thousands of JNI and Python/C FFI rules. We use a mapping function to specify which state machines, transitions, and program entities (threads, objects, references) to check at each FFI call and return. From this function, we synthesize a context-specific dynamic analysis to find FFI bugs. We build bug detection tools for JNI and Python/C using this approach. For JNI, we dynamically and transparently interpose the analysis on Java and C language transitions through the JVM tools interface. The resulting tool, called Jinn, is compiler and virtual machine independent. It detects and diagnoses a wide variety of FFI bugs that other tools miss. This approach greatly reduces the annotation burden by exploiting common FFI constraints: whereas the generated Jinn code is 22,000+ lines, we wrote only 1,400 lines of state machine and mapping code. Overall, this paper lays the foundation for a more principled approach to developing correct multilingual software and a more concise and automated approach to FFI specification.
While iterative optimization has become a popular compiler optimization approach, it is based on a premise which has never been truly evaluated: that it is possible to learn the best compiler optimizations across data...
详细信息
ISBN:
(纸本)9781450300193
While iterative optimization has become a popular compiler optimization approach, it is based on a premise which has never been truly evaluated: that it is possible to learn the best compiler optimizations across data sets. Up to now, most iterative optimization studies find the best optimizations through repeated runs on the same data set. Only a handful of studies have attempted to exercise iterative optimization on a few tens of data sets. In this paper, we truly put iterative compilation to the test for the first time by evaluating its effectiveness across a large number of data sets. We therefore compose KDataSets, a data set suite with 1000 data sets for 32 programs, which we release to the public. We characterize the diversity of KDataSets, and subsequently use it to evaluate iterative optimization. We demonstrate that it is possible to derive a robust iterative optimization strategy across data sets: for all 32 programs, we find that there exists at least one combination of compiler optimizations that achieves 86% or more of the best possible speedup across all data sets using Intel's ICC (83% for GNU's GCC). This optimal combination is program-specific and yields speedups up to 1.71 on ICC and 2.23 on GCC over the highest optimization level (-fast and -03, respectively). This finding makes the task of optimizing programs across data sets much easier than previously anticipated, and it paves the way for the practical and reliable usage of iterative optimization. Finally, we derive pre-shipping and post-shipping optimization strategies for software vendors.
Domain-specific languages (DSLs) provide high expressive power focused on a particular problem domain. They provide linguistic abstractions and specialized syntax specifically designed for a domain, allowing developer...
详细信息
ISBN:
(纸本)9781450302401
Domain-specific languages (DSLs) provide high expressive power focused on a particular problem domain. They provide linguistic abstractions and specialized syntax specifically designed for a domain, allowing developers to avoid boilerplate code and low-level implementation details. language workbenches are tools that integrate all aspects of the definition of domain-specific or general-purpose software languages and the creation of a programming environment from such a definition. To count as a language workbench, a tool needs to satisfy basic requirements for the integrated definition of syntax, semantics, and editor services, and preferably also support language extension and composition. Within these requirements there is ample room for variation in the design of a language workbench. In this tutorial, we give an introduction to the state of the art in textual DSLs and language workbenches. We discuss the main requirements and variation points in the design of language workbenches, and describe two points in the design space using two state-of-the-art language workbenches. Spoofax is an example of a parser-based language workbench, while MPS represents language workbenches based on projectional editors.
As software becomes increasingly complex and difficult to analyze, it is more and more common for developers to use high-level, type-safe, object-oriented (OO) programminglanguages and to architect systems that compr...
详细信息
ISBN:
(纸本)9781450302036
As software becomes increasingly complex and difficult to analyze, it is more and more common for developers to use high-level, type-safe, object-oriented (OO) programminglanguages and to architect systems that comprise multiple components. Different components are often implemented in different programminglanguages. In state-of-the-art multicomponent, multi-language systems, cross-component communication relies on remote procedure calls (RPC) and message passing. As components are increasingly co-located on the same physical machine to ensure high utilization of multi-core systems, there is a growing potential for using shared memory for cross-language cross-runtime communication. We present the design and implementation of Co-Located Runtime Sharing (CoLoRS), a system that enables cross-language, cross-runtime type-safe, transparent shared memory. CoLoRS provides object sharing for co-located OO runtimes for both static and dynamic languages. CoLoRS defines a language-neutral object/class model, which is a static-dynamic hybrid and enables class evolution while maintaining the space/time efficiency of a static model. CoLoRS uses type mapping and class versioning to transparently map shared types to private types. CoLoRS also contributes a synchronization mechanism and a parallel, concurrent, on-thefly GC algorithm, both designed to facilitate cross-language cross-runtime object sharing. We implement CoLoRS in open-source, production-quality runtimes for Python and Java. Our empirical evaluation shows that CoLoRS extensions impose low overhead. We also investigate RPC over CoLoRS and find that using shared memory to implement co-located RPC significantly improves both communication throughput and latency by avoiding data structure serialization.
Atomic regions are an important concept in correct concurrent programming: since atomic regions can be viewed as having executed in a single step, atomicity greatly reduces the number of possible interleavings the pro...
详细信息
ISBN:
(纸本)9781450302036
Atomic regions are an important concept in correct concurrent programming: since atomic regions can be viewed as having executed in a single step, atomicity greatly reduces the number of possible interleavings the programmer needs to consider. This paper describes a method for building atomicity into a programminglanguage in an organic fashion. We take the view that atomicity holds for whole threads by default, and a division into smaller atomic regions occurs only at points where an explicit need for sharing is needed and declared. A corollary of this view is every line of code is part of some atomic region. We define a polymorphic type system, Task Types, to enforce most of the desired atomicity properties statically. We show the reasonableness of our type system by proving that type soundness, isolation invariance, and atomicity enforcement properties hold at run time. We also present initial results of a Task Types implementation built on Java.
Tracing just-in-time compilers (TJITs) determine frequently executed traces (hot paths and loops) in running programs and focus their optimization effort by emitting optimized machine code specialized to these traces....
详细信息
ISBN:
(纸本)9781450302036
Tracing just-in-time compilers (TJITs) determine frequently executed traces (hot paths and loops) in running programs and focus their optimization effort by emitting optimized machine code specialized to these traces. Prior work has established this strategy to be especially beneficial for dynamic languages such as JavaScript, where the TJIT interfaces with the interpreter and produces machine code from the JavaScript trace. This direct coupling with a JavaScript interpreter makes it difficult to harness the power of a TJIT for other components that are not written in JavaScript, e. g., the DOM implementation or the layout engine inside a browser. Furthermore, if a TJIT is tied to a particular high-level language interpreter, it is difficult to reuse it for other input languages as the optimizations are likely targeted at specific idioms of the source language. To address these issues, we designed and implemented a TJIT for Microsoft's Common Intermediate language CIL (the target language of C#, VisualBasic, F#, and many other languages). Working on CIL enables TJIT optimizations for any program compiled to this platform. In addition, to validate that the performance gains of a TJIT for JavaScript do not depend on specific idioms of JavaScript that are lost in the translation to CIL, we provide a performance evaluation of our JavaScript runtime which translates JavaScript to CIL and then runs on top of our CIL TJIT.
暂无评论