We address the problem of synthesizing code completions for programs using APIs. Given a program with holes, we synthesize completions for holes with the most likely sequences of method calls. Our main idea is to redu...
详细信息
ISBN:
(纸本)9781450327848
We address the problem of synthesizing code completions for programs using APIs. Given a program with holes, we synthesize completions for holes with the most likely sequences of method calls. Our main idea is to reduce the problem of code completion to a natural-language processing problem of predicting probabilities of sentences. We design a simple and scalable static analysis that extracts sequences of method calls from a large codebase, and index these into a statistical language model. We then employ the language model to find the highest ranked sentences, and use them to synthesize a code completion. Our approach is able to synthesize sequences of calls across multiple objects together with their arguments. Experiments show that our approach is fast and effective. Virtually all computed completions typecheck, and the desired completion appears in the top 3 results in 90% of the cases.
Reasoning about string variables, in particular program inputs, is an important aspect of many program analyses and testing frameworks. Program inputs invariably arrive as strings, and are often manipulated using high...
详细信息
ISBN:
(纸本)9781605583921
Reasoning about string variables, in particular program inputs, is an important aspect of many program analyses and testing frameworks. Program inputs invariably arrive as strings, and are often manipulated using high-level string operations such as equality checks, regular expression matching, and string concatenation. It is difficult to reason about these operations because they are not well-integrated into current constraint solvers. We present a decision procedure that solves systems of equations over regular language variables. Given such a system of constraints, our algorithm finds satisfying assignments for the variables in the system. We define this problem formally and render a mechanized correctness proof of the core of the algorithm. We evaluate its scalability and practical utility by applying it to the problem of automatically finding inputs that cause SQL injection vulnerabilities.
We present a new limited form of interprocedural analysis called field analysis that can be used by a compiler to reduce the costs of modern language features such as object-oriented programming, automatic memory mana...
详细信息
We present a new limited form of interprocedural analysis called field analysis that can be used by a compiler to reduce the costs of modern language features such as object-oriented programming, automatic memory management, and run-time checks required for type safety. Unlike many previous interprocedural analyses, our analysis is cheap, and does not require access to the entire program. Field analysis exploits the declared access restrictions placed on fields in a modular language (e.g. field access modifiers in Java) in order to determine useful properties of fields of an object, We describe our implementation of field analysis in the Swift optimizing compiler for Java, as well a set of optimizations that exploit the results of field analysis. These optimizations include removal of run-time tests, compile-time resolution of method calls, object inlining, removal of unnecessary synchronization, and stack allocation. Our results demonstrate that field analysis is efficient and effective. Speedups average 7% on a wide range of applications, with some times reduced by up to 27%. Compile time overhead of field analysis is about 10%.
We present an extension of field analysis (see [4]) called related field analysis which is a general technique for proving relationships between two or more fields of an object. We demonstrate the feasibility and appl...
详细信息
ISBN:
(纸本)9781581134148
We present an extension of field analysis (see [4]) called related field analysis which is a general technique for proving relationships between two or more fields of an object. We demonstrate the feasibility and applicability of related field analysis by applying it to the problem of removing array bounds checks. For array bounds check removal, we define a pair of related fields to be an integer field and an array field for which the integer field has a known relationship to the length of the array. This related field information can then be used to remove array bounds checks from accesses to the array field. Our results show that related field analysis can remove an average of 50% of the dynamic array bounds checks on-a wide range of applications. We describe the implementation of related field analysis in the Swift optimizing compiler for Java, as well as the optimizations that exploit the results of related field analysis.
Syntactic sugar is pervasive in language technology. It is used to shrink the size of a core language;to define domain-specific languages;and even to let programmers extend their language. Unfortunately, syntactic sug...
详细信息
ISBN:
(纸本)9781450327848
Syntactic sugar is pervasive in language technology. It is used to shrink the size of a core language;to define domain-specific languages;and even to let programmers extend their language. Unfortunately, syntactic sugar is eliminated by transformation, so the resulting programs become unfamiliar to authors. Thus, it comes at a price: it obscures the relationship between the user's source program and the program being evaluated. We address this problem by showing how to compute reduction steps in terms of the surface syntax. Each step in the surface language emulates one or more steps in the core language. The computed steps hide the transformation, thus maintaining the abstraction provided by the surface language. We make these statements about emulation and abstraction precise, prove that they hold in our formalism, and verify part of the system in Coq. We have implemented this work and applied it to three very different languages.
The design and implementation of a compiler that translates programs written in a type-safe subset of the C programminglanguage into highly optimized DEC Alpha assembly language programs and a certifier that automati...
详细信息
The design and implementation of a compiler that translates programs written in a type-safe subset of the C programminglanguage into highly optimized DEC Alpha assembly language programs and a certifier that automatically checks the type safety and memory safety of any assembly language program produced by the compiler is presented. The result of the certifier is either a formal proof of type safety or a counterexample pointing to a potential violation of the type system by the target program. The ensemble of the compiler and the certifier is called a certifying compiler.
This paper proposes to combine two seemingly opposed programming models for building massively concurrent network services: the event-driven model and the multithreaded model. The result is a hybrid design that offers...
详细信息
ISBN:
(纸本)9781595936332
This paper proposes to combine two seemingly opposed programming models for building massively concurrent network services: the event-driven model and the multithreaded model. The result is a hybrid design that offers the best of both worlds-the ease of use and expressiveness of threads and the flexibility and performance of events. This paper shows how the hybrid model can be implemented entirely at the application level using concurrency monads in Haskell, which provides type-safe abstractions for both events and threads. This approach simplifies the development of massively concurrent software in a way that scales to real-world network services. The Haskell implementation supports exceptions, symmetrical multiprocessing, software transactional memory, asynchronous I/O mechanisms and application-level network protocol stacks. Experimental results demonstrate that this monad-based approach has good performance: the threads are extremely lightweight (scaling to ten million threads), and the I/O performance compares favorably to that of Linux NPTL.
PADS is a declarative data description language that allows data analysts to describe both the physical layout of ad hoc data sources and semantic properties of that data. From such descriptions, the PADS compiler gen...
详细信息
PADS is a declarative data description language that allows data analysts to describe both the physical layout of ad hoc data sources and semantic properties of that data. From such descriptions, the PADS compiler generates libraries and tools for manipulating the data, including parsing routines, statistical profiling tools, translation programs to produce well-behaved formats such as XML or those required for loading relational databases, and tools for running XQueries over raw PADS data sources. The descriptions are concise enough to serve as "living" documentation while flexible enough to describe most of the ASCII, binary, and Cobol formats that we have seen in practice. The generated parsing library provides for robust, application-specific error handling.
We present FLIX, a declarative programminglanguage for specifying and solving least fixed point problems, particularly static program analyses. FLIX is inspired by Datalog and extends it with lattices and monotone fu...
详细信息
ISBN:
(纸本)9781450342612
We present FLIX, a declarative programminglanguage for specifying and solving least fixed point problems, particularly static program analyses. FLIX is inspired by Datalog and extends it with lattices and monotone functions. Using FLIX, implementors of static analyses can express a broader range of analyses than is currently possible in pure Datalog, while retaining its familiar rule-based syntax. We define a model-theoretic semantics of FLIX as a natural extension of the Datalog semantics. This semantics captures the declarative meaning of FLIX programs without imposing any specific evaluation strategy. An efficient strategy is semi-naive evaluation which we adapt for FLIX. We have implemented a compiler and runtime for FLIX, and used it to express several well-known static analyses, including the IFDS and IDE algorithms. The declarative nature of FLIX clearly exposes the similarity between these two algorithms.
Recent work on alias analysis in the presence of pointers has concentrated on context-sensitive interprocedural analyses, which treat multiple calls to a single procedure independently rather than constructing a singl...
详细信息
ISBN:
(纸本)9780897916974
Recent work on alias analysis in the presence of pointers has concentrated on context-sensitive interprocedural analyses, which treat multiple calls to a single procedure independently rather than constructing a single approximation to a procedure's effect on all of its callers. While context-sensitive modeling offers the potential for greater precision by considering only realizable call-return paths, its empirical benefits have yet to be measured. This paper compares the precision of a simple, efficient, context-insensitive points-to analysis for the C programminglanguage with that of a maximally context-sensitive version of the same analysis. We demonstrate that, for a number of pointer-intensive benchmark programs, context-insensitivity exerts little to no precision penalty. We also describe techniques for using the output of context-insensitive analysis to improve the efficiency of context-sensitive analysis without affecting precision.
暂无评论