Providing feedback on programming assignments is a tedious task for the instructor, and even impossible in large Massive Open Online Courses withthousands of students. Previous research has suggested that program rep...
详细信息
ISBN:
(纸本)9781450356985
Providing feedback on programming assignments is a tedious task for the instructor, and even impossible in large Massive Open Online Courses withthousands of students. Previous research has suggested that program repair techniques can be used to generate feedback in programming education. In this paper, we present a novel fully automated program repair algorithm for introductory programming assignments. the key idea of the technique, which enables automation and scalability, is to use the existing correct student solutions to repair the incorrect attempts. We evaluate the approach in two experiments: (I) We evaluate the number, size and quality of the generated repairs on 4,293 incorrect student attempts from an existing MOOC. We find that our approach can repair 97% of student attempts, while 81% of those are small repairs of good quality. (II) We conduct a preliminary user study on performance and repair usefulness in an interactive teaching setting. We obtain promising initial results (the average usefulness grade 3.4 on a scale from 1 to 5), and conclude that our approach can be used in an interactive setting.
With an ever-growing amount of collected data, the importance of visualization as an analysis component is growing in concert. the creation of good visualizations often doesn't happen in one step but is rather an ...
详细信息
ISBN:
(纸本)9781450360456
With an ever-growing amount of collected data, the importance of visualization as an analysis component is growing in concert. the creation of good visualizations often doesn't happen in one step but is rather an iterative and exploratory process. However, this process is currently not well supported in most of the available visualization tools and systems. Visualization authors are forced to commit prematurely to particular design aspects of their creations, and when exploring potential variant visualizations, they are forced to adopt ad hoc techniques such as copying code snippets or keeping a collection of separate files. We propose variational visualizations as a model supporting open-ended exploration of the design space of information visualization. Together withthat model, we present a prototype implementation in the form of a domain-specific language embedded in Purescript.
We present a novel approach for approximate sampling in probabilistic programs based on incremental inference. the key idea is to adapt the samples for a program P into samples for a program Q, thereby avoiding the ex...
详细信息
ISBN:
(纸本)9781450356985
We present a novel approach for approximate sampling in probabilistic programs based on incremental inference. the key idea is to adapt the samples for a program P into samples for a program Q, thereby avoiding the expensive sampling computation for program Q. To enable incremental inference in probabilistic programming, our work: (i) introduces the concept of a trace translator which adapts samples from P into samples of Q, (ii) phrases this translation approach in the context of sequential Monte Carlo (SMC), which gives theoretical guarantees that the adapted samples converge to the distribution induced by Q, and (iii) shows how to obtain a concrete trace translator by establishing a correspondence between the random choices of the two probabilistic programs. We implemented our approach in two different probabilistic programming systems and showed that, compared to methods that sample the program Q from scratch, incremental inference can lead to orders of magnitude increase in efficiency, depending on how closely related P and Q are.
Industry is increasingly turning to reconfigurable architectures like FPGAs and CGRAs for improved performance and energy efficiency. Unfortunately, adoption of these architectures has been limited by their programmin...
详细信息
ISBN:
(纸本)9781450356985
Industry is increasingly turning to reconfigurable architectures like FPGAs and CGRAs for improved performance and energy efficiency. Unfortunately, adoption of these architectures has been limited by their programming models. HDLs lack abstractions for productivity and are difficult to target from higher level languages. HLS tools are more productive, but offer an ad-hoc mix of software and hardware abstractions which make performance optimizations difficult. In this work, we describe a new domain-specific language and compiler called Spatial for higher level descriptions of application accelerators. We describe Spatial's hardware-centric abstractions for both programmer productivity and design performance, and summarize the compiler passes required to support these abstractions, including pipeline scheduling, automatic memory banking, and automated design tuning driven by active machine learning. We demonstrate the language's ability to target FPGAs and CGRAs from common source code. We show that applications written in Spatial are, on average, 42% shorter and achieve a mean speedup of 2.9x over SDAccel HLS when targeting a Xilinx UltraScale+ VU9P FPGA on an Amazon EC2 F1 instance.
this paper presents a new static analysis for deriving upper bounds on the expected resource consumption of probabilistic programs. the analysis is fully automatic and derives symbolic bounds that are multivariate pol...
详细信息
ISBN:
(纸本)9781450356985
this paper presents a new static analysis for deriving upper bounds on the expected resource consumption of probabilistic programs. the analysis is fully automatic and derives symbolic bounds that are multivariate polynomials in the inputs. the new technique combines manual state-of-the-art reasoning techniques for probabilistic programs with an effective method for automatic resource-bound analysis of deterministic programs. It can be seen as both, an extension of automatic amortized resource analysis (AARA) to probabilistic programs and an automation of manual reasoning for probabilistic programs that is based on weakest preconditions. An advantage of the technique is that it combines the clarity and compositionality of a weakest-precondition calculus withthe efficient automation of AARA. As a result, bound inference can be reduced to off-the-shelf LP solving in many cases and automatically-derived bounds can be interactively extended with standard program logics if the automation fails. Building on existing work, the soundness of the analysis is proved with respect to an operational semantics that is based on Markov decision processes. the effectiveness of the technique is demonstrated with a prototype implementationthat is used to automatically analyze 39 challenging probabilistic programs and randomized algorithms. Experiments indicate that the derived constant factors in the bounds are very precise and even optimal for some programs.
A classic problem in parallel computing is to take a high-level parallel program written, for example, in nested-parallel style with fork-join constructs and run it efficiently on a real machine. the problem could be ...
详细信息
ISBN:
(纸本)9781450356985
A classic problem in parallel computing is to take a high-level parallel program written, for example, in nested-parallel style with fork-join constructs and run it efficiently on a real machine. the problem could be considered solved in theory, but not in practice, because the overheads of creating and managing parallel threads can overwhelm their benefits. Developing efficient parallel codes therefore usually requires extensive tuning and optimizations to reduce parallelism just to a point where the overheads become acceptable. In this paper, we present a scheduling technique that delivers provably efficient results for arbitrary nested-parallel programs, without the tuning needed for controlling parallelism overheads. the basic idea behind our technique is to create threads only at a beat (which we refer to as the "heartbeat") and make sure to do useful work in between. We specify our heartbeat scheduler using an abstract-machine semantics and provide mechanized proofs that the scheduler guarantees low overheads for all nested parallel programs. We present a prototype C++ implementation and an evaluation that shows that Heartbeat competes well with manually optimized Cilk Plus codes, without requiring manual tuning.
programming concurrent, distributed systems is hard-especially when these systems mutate shared, persistent state replicated at geographic scale. To enable high availability and scalability, a new class of weakly cons...
详细信息
ISBN:
(纸本)9781450356985
programming concurrent, distributed systems is hard-especially when these systems mutate shared, persistent state replicated at geographic scale. To enable high availability and scalability, a new class of weakly consistent data stores has become popular. However, some data needs strong consistency. To manipulate both weakly and strongly consistent data in a single transaction, we introduce a new abstraction: mixed-consistency transactions, embodied in a new embedded language, MixT. Programmers explicitly associate consistency models with remote storage sites;each atomic, isolated transaction can access a mixture of data with different consistency models. Compile-time information-flow checking, applied to consistency models, ensures that these models are mixed safely and enables the compiler to automatically partition transactions. New run-time mechanisms ensure that consistency models can also be mixed safely, even when the data used by a transaction resides on separate, mutually unaware stores. Performance measurements show that despite their stronger guarantees, mixed-consistency transactions retain much of the speed of weak consistency, significantly outperforming traditional serializable transactions.
Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming, and increase programmer productivity. A major challenge when learning from program...
详细信息
ISBN:
(纸本)9781450356985
Predicting program properties such as names or expression types has a wide range of applications. It can ease the task of programming, and increase programmer productivity. A major challenge when learning from programs is how to represent programs in a way that facilitates effective learning. We present a general path-based representation for learning from programs. Our representation is purely syntactic and extracted automatically. the main idea is to represent a program using paths in its abstract syntax tree (AST). this allows a learning model to leverage the structured nature of code rather than treating it as a flat sequence of tokens. We showthat this representation is general and can: (i) cover different prediction tasks, (ii) drive different learning algorithms (for both generative and discriminative models), and (iii) work across different programminglanguages. We evaluate our approach on the tasks of predicting variable names, method names, and full types. We use our representation to drive both CRF-based and word2vec-based learning, for programs of four languages: JavaScript, Java, Python and C#. Our evaluation shows that our approach obtains better results than task-specific handcrafted representations across different tasks and programminglanguages.
Nascent persistent memory (PM) technologies promise the performance of DRAM withthe durability of disk, but how best to integrate them into programming systems remains an open question. Recent work extends language m...
详细信息
ISBN:
(纸本)9781450356985
Nascent persistent memory (PM) technologies promise the performance of DRAM withthe durability of disk, but how best to integrate them into programming systems remains an open question. Recent work extends language memory models with a persistency model prescribing semantics for updates to PM. these semantics enable programmers to design data structures in PM that are accessed like memory and yet are recoverable upon crash or failure. Alas, we find the semantics and performance of existing approaches unsatisfying. Existing approaches require high-overhead mechanisms, are restricted to certain synchronization constructs, provide incomplete semantics, and/or may recover to state that cannot arise in fault-free execution. We propose persistency semantics that guarantee failure atomicity of synchronization-free regions (SFRs) -program regions delimited by synchronization operations. Our approach provides clear semantics for the PM state recovery code may observe and extends C++11's "sequential consistency for data-race-free" guarantee to post-failure recovery code. We investigate two designs for failure-atomic SFRs that vary in performance and the degree to which commit of persistent state may lag execution. We demonstrate both approaches in LLVM v3.6.0 and compare to a state-of-the-art baseline to show performance improvement up to 87.5% (65.5% avg).
An immutable multi-map is a many-to-many map data structure with expected fast insert and lookup operations. this data structure is used for applications processing graphs or many-to-many relations as applied in compi...
详细信息
ISBN:
(纸本)9781450356985
An immutable multi-map is a many-to-many map data structure with expected fast insert and lookup operations. this data structure is used for applications processing graphs or many-to-many relations as applied in compilers, runtimes of programminglanguages, or in static analysis of object-oriented systems. Collection data structures are assumed to carefully balance execution time of operations with memory consumption characteristics and need to scale gracefully from a few elements to multiple gigabytes at least. When processing larger in-memory data sets the overhead of the data structure encoding itself becomes a memory usage bottleneck, dominating the overall performance. In this paper we propose AXIOM, a novel hash-trie data structure that allows for a highly efficient and type-safe multi-map encoding by distinguishing inlined values of singleton sets from nested sets of multi-mappings. AXIOM strictly generalizes over previous hash-trie data structures by supporting the processing of fine-grained type-heterogeneous content on the implementation level (while API and language support for type-heterogeneity are not scope of this paper). We detail the design and optimizations of AXIOM and further compare it against state-of-the-art immutable maps and multi-maps in Java, Scala and Clojure. We isolate key differences using microbenchmarks and validate the resulting conclusions on a case study in static analysis. AXIOM reduces the key-value storage overhead by 1.87 x;with specializing and inlining across collection boundaries it improves by 5.1 x.
暂无评论