We present an approach to enriching the type system of ML with a restricted form of dependent types, where type index objects are drawn from a constraint domain C, leading to the DML(C) language schema. this allows sp...
详细信息
We present an approach to enriching the type system of ML with a restricted form of dependent types, where type index objects are drawn from a constraint domain C, leading to the DML(C) language schema. this allows specification and inference of significantly more precise type information, facilitating program error detection and compiler optimization. A major complication resulting from introducing dependent types is that pure type inference for the enriched system is no longer possible, but we show that type-checking a sufficiently annotated program in DML(C) can be reduced to constraint satisfaction in the constraint domain C. We exhibit the unobtrusiveness of our approach through practical examples and prove that DML(C) is conservative over ML. the main contribution of the paper lies in our languagedesign, including the formulation of type-checking rules which makes the approach practical. To our knowledge, no previous type system for a general purpose programminglanguage such as ML has combined dependent types with features including datatype declarations, higher-order functions, general recursions, let-polymorphism, mutable references, and exceptions. In addition, we have finished a prototype implementation of DML(C) for an integer constraint domain C, where constraints are linear inequalities (Xi and Pfenning 1998).
To understand diverse natural language commands, virtual assistants today are trained with numerous labor-intensive, manually annotated sentences. this paper presents a methodology and the Genie toolkit that can handl...
详细信息
ISBN:
(纸本)9781450367127
To understand diverse natural language commands, virtual assistants today are trained with numerous labor-intensive, manually annotated sentences. this paper presents a methodology and the Genie toolkit that can handle new compound commands with significantly less manual effort. We advocate formalizing the capability of virtual assistants with a Virtual Assistant programminglanguage (VAPL) and using a neural semantic parser to translate natural language into VAPL code. Genie needs only a small realistic set of input sentences for validating the neural model. Developers write templates to synthesize data;Genie uses crowdsourced paraphrases and data augmentation, along withthe synthesized data, to train a semantic parser. We also propose design principles that make VAPL languages amenable to natural language translation. We apply these principles to revise thingTalk, the language used by the Almond virtual assistant. We use Genie to build the first semantic parser that can support compound virtual assistants commands with unquoted free-form parameters. Genie achieves a 62% accuracy on realistic user inputs. We demonstrate Genie's generality by showing a 19% and 31% improvement over the previous state of the art on a music skill, aggregate functions, and access control.
Subtle concurrency errors in multithreaded libraries that arise because of incorrect or inadequate synchronization are often difficult to pinpoint precisely using only static techniques. On the other hand, the effecti...
详细信息
ISBN:
(纸本)9781450334686
Subtle concurrency errors in multithreaded libraries that arise because of incorrect or inadequate synchronization are often difficult to pinpoint precisely using only static techniques. On the other hand, the effectiveness of dynamic race detectors is critically dependent on multithreaded test suites whose execution can be used to identify and trigger races. Usually, such multithreaded tests need to invoke a specific combination of methods with objects involved in the invocations being shared appropriately to expose a race. Without a priori knowledge of the race, construction of such tests can be challenging. In this paper, we present a lightweight and scalable technique for synthesizing precisely these kinds of tests. Given a multithreaded library and a sequential test suite, we describe a fully automated analysis that examines sequential execution traces, and produces as its output a concurrent client program that drives shared objects via library method calls to states conducive for triggering a race. Experimental results on a variety of well-tested Java libraries yield 101 synthesized multithreaded tests in less than four minutes. Analyzing the execution of these tests using an off-the-shelf race detector reveals 187 harmful races, including several previously unreported ones.
We present a new type system combining occurrence typing- a technique previously used to type check programs in dynamically-typed languages such as Racket, Clojure, and JavaScript-with dependent refinement types. We d...
详细信息
ISBN:
(纸本)9781450342612
We present a new type system combining occurrence typing- a technique previously used to type check programs in dynamically-typed languages such as Racket, Clojure, and JavaScript-with dependent refinement types. We demonstrate that the addition of refinement types allows the integration of arbitrary solver-backed reasoning about logical propositions from external theories. By building on occurrence typing, we can add our enriched type system as a natural extension of Typed Racket, reusing its core while increasing its expressiveness. the result is a well-tested type system with a conservative, decidable core in which types may depend on a small but extensible set of program terms. In addition to describing our design, we present the following: a formal model and proof of correctness;a strategy for integrating new theories, with specific examples including linear arithmetic and bitvectors;and an evaluation in the context of the full Typed Racket implementation. Specifically, we take safe vector operations as a case study, examining all vector accesses in a 56,000 line corpus of Typed Racket programs. Our system is able to prove that 50% of these are safe with no new annotations, and with a few annotations and modifications we capture more than 70%.
A classic problem in parallel computing is to take a high-level parallel program written, for example, in nested-parallel style with fork-join constructs and run it efficiently on a real machine. the problem could be ...
详细信息
ISBN:
(纸本)9781450356985
A classic problem in parallel computing is to take a high-level parallel program written, for example, in nested-parallel style with fork-join constructs and run it efficiently on a real machine. the problem could be considered solved in theory, but not in practice, because the overheads of creating and managing parallel threads can overwhelm their benefits. Developing efficient parallel codes therefore usually requires extensive tuning and optimizations to reduce parallelism just to a point where the overheads become acceptable. In this paper, we present a scheduling technique that delivers provably efficient results for arbitrary nested-parallel programs, without the tuning needed for controlling parallelism overheads. the basic idea behind our technique is to create threads only at a beat (which we refer to as the "heartbeat") and make sure to do useful work in between. We specify our heartbeat scheduler using an abstract-machine semantics and provide mechanized proofs that the scheduler guarantees low overheads for all nested parallel programs. We present a prototype C++ implementation and an evaluation that shows that Heartbeat competes well with manually optimized Cilk Plus codes, without requiring manual tuning.
We show how to combine a general purpose type system for an existing language with support for programming with binders and contexts by refining the type system of ML with a restricted form of dependent types where in...
详细信息
ISBN:
(纸本)9781450310833
We show how to combine a general purpose type system for an existing language with support for programming with binders and contexts by refining the type system of ML with a restricted form of dependent types where index objects are drawn from contextual LF. this allows the user to specify formal systems within the logical framework LF and index ML types with contextual LF objects. Our languagedesign keeps the index language generic only requiring decidability of equality of the index language providing a modular design. To illustrate the elegance and effectiveness of our language, we give programs for closure conversion and normalization by evaluation. Our three key technical contribution are: 1) We give a bidirectional type system for our core language which is centered around refinement substitutions instead of constraint solving. As a consequence, type checking is decidable and easy to trust, although constraint solving may be undecidable. 2) We give a big-step environment based operational semantics with environments which lends itself to efficient implementation. 3) We prove our language to be type safe and have mechanized our theoretical development in the proof assistant Coq using the fresh approach to binding.
Image processing pipelines combine the challenges of stencil computations and stream programs. they are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or da...
详细信息
ISBN:
(纸本)9781450320146
Image processing pipelines combine the challenges of stencil computations and stream programs. they are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Because of their complex structure, the performance difference between a naive implementation of a pipeline and an optimized one is often an order of magnitude. Efficient implementations require optimization of both parallelism and locality, but due to the nature of stencils, there is a fundamental tension between parallelism, locality, and introducing redundant recomputation of shared values. We present a systematic model of the tradeoff space fundamental to stencil pipelines, a schedule representation which describes concrete points in this space for each stage in an image processing pipeline, and an optimizing compiler for the Halide image processing languagethat synthesizes high performance implementations from a Halide algorithm and a schedule. Combining this compiler with stochastic search over the space of schedules enables terse, composable programs to achieve state-of-the-art performance on a wide range of real image processing pipelines, and across different hardware architectures, including multicores with SIMD, and heterogeneous CPU+GPU execution. From simple Halide programs written in a few hours, we demonstrate performance up to 5x faster than hand-tuned C, intrinsics, and CUDA implementations optimized by experts over weeks or months, for image processing applications beyond the reach of past automatic compilers.
this paper presents a new static analysis for deriving upper bounds on the expected resource consumption of probabilistic programs. the analysis is fully automatic and derives symbolic bounds that are multivariate pol...
详细信息
ISBN:
(纸本)9781450356985
this paper presents a new static analysis for deriving upper bounds on the expected resource consumption of probabilistic programs. the analysis is fully automatic and derives symbolic bounds that are multivariate polynomials in the inputs. the new technique combines manual state-of-the-art reasoning techniques for probabilistic programs with an effective method for automatic resource-bound analysis of deterministic programs. It can be seen as both, an extension of automatic amortized resource analysis (AARA) to probabilistic programs and an automation of manual reasoning for probabilistic programs that is based on weakest preconditions. An advantage of the technique is that it combines the clarity and compositionality of a weakest-precondition calculus withthe efficient automation of AARA. As a result, bound inference can be reduced to off-the-shelf LP solving in many cases and automatically-derived bounds can be interactively extended with standard program logics if the automation fails. Building on existing work, the soundness of the analysis is proved with respect to an operational semantics that is based on Markov decision processes. the effectiveness of the technique is demonstrated with a prototype implementationthat is used to automatically analyze 39 challenging probabilistic programs and randomized algorithms. Experiments indicate that the derived constant factors in the bounds are very precise and even optimal for some programs.
Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming, and increase programmer productivity. A major challenge when learning from program...
详细信息
ISBN:
(纸本)9781450356985
Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming, and increase programmer productivity. A major challenge when learning from programs is how to represent programs in a way that facilitates effective learning. We present a general path-based representation for learning from programs. Our representation is purely syntactic and extracted automatically. the main idea is to represent a program using paths in its abstract syntax tree (AST). this allows a learning model to leverage the structured nature of code rather than treating it as a flat sequence of tokens. We showthat this representation is general and can: (i) cover different prediction tasks, (ii) drive different learning algorithms (for both generative and discriminative models), and (iii) work across different programminglanguages. We evaluate our approach on the tasks of predicting variable names, method names, and full types. We use our representation to drive both CRF-based and word2vec-based learning, for programs of four languages: JavaScript, Java, Python and C#. Our evaluation shows that our approach obtains better results than task-specific handcrafted representations across different tasks and programminglanguages.
the proceedings contain 8 papers. the topics discussed include: fluid data structures;detecting unsatisfiable CSS rules in the presence of DTDs;towards compiling graph queries in relational engines;streaming saturatio...
ISBN:
(纸本)9781450367189
the proceedings contain 8 papers. the topics discussed include: fluid data structures;detecting unsatisfiable CSS rules in the presence of DTDs;towards compiling graph queries in relational engines;streaming saturation for large RDF graphs with dynamic schema information;arc: an IR for batch and stream programming;on the semantics of cypher’s implicit group-by;mixing set and bag semantics;and language-integrated provenance by trace analysis.
暂无评论