ISBN: 9781450367127 (print)
To understand diverse natural language commands, virtual assistants today are trained with numerous labor-intensive, manually annotated sentences. This paper presents a methodology and the Genie toolkit that can handle new compound commands with significantly less manual effort. We advocate formalizing the capability of virtual assistants with a Virtual Assistant Programming Language (VAPL) and using a neural semantic parser to translate natural language into VAPL code. Genie needs only a small realistic set of input sentences for validating the neural model. Developers write templates to synthesize data; Genie uses crowdsourced paraphrases and data augmentation, along with the synthesized data, to train a semantic parser. We also propose design principles that make VAPL languages amenable to natural language translation. We apply these principles to revise ThingTalk, the language used by the Almond virtual assistant. We use Genie to build the first semantic parser that can support compound virtual assistant commands with unquoted free-form parameters. Genie achieves 62% accuracy on realistic user inputs. We demonstrate Genie's generality by showing a 19% and 31% improvement over the previous state of the art on a music skill, aggregate functions, and access control.
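The template-based synthesis step described above can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the phrase lists, the `synthesize` helper, and the program notation are invented and are neither Genie's template language nor ThingTalk syntax.

```python
from itertools import product

# Hypothetical primitive templates: (natural-language phrase, program fragment).
# The fragment notation is invented for illustration; it is not ThingTalk.
TRIGGERS = [
    ("when I receive an email", "monitor(email.inbox)"),
    ("when it starts raining", "monitor(weather.rain)"),
]
ACTIONS = [
    ("send me a text", "sms.send"),
    ("turn on the lights", "light.on"),
]


def synthesize(triggers, actions):
    """Pair every trigger with every action to form compound commands."""
    pairs = []
    for (t_text, t_code), (a_text, a_code) in product(triggers, actions):
        sentence = f"{t_text}, {a_text}"
        program = f"{t_code} => {a_code}()"
        pairs.append((sentence, program))
    return pairs


for sentence, program in synthesize(TRIGGERS, ACTIONS):
    print(f"{sentence!r} -> {program}")
```

Two triggers and two actions already give four sentence and program pairs. In the pipeline the abstract describes, such synthesized pairs would be combined with crowdsourced paraphrases and data augmentation before training.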
ISBN: 9781450369817 (print)
Adaptive software is becoming increasingly important as computing grows more context-dependent. Runtime adaptability can be achieved by dynamically selecting and applying context-specific code. Role-oriented programming has been proposed as a paradigm to enable runtime-adaptive software by design. Roles change the objects' behavior at runtime and thus allow adapting the software to a given context. However, this increased variability and expressiveness has a direct impact on performance and memory consumption. We found a high overhead in the steady-state performance of executing compositions of adaptations. This paper presents a new approach that uses runtime information to construct a dispatch plan that can be executed efficiently by the JVM. The concept of late binding is extended to dynamic function compositions. We evaluated the implementation with a benchmark for role-oriented programming languages that leverages context-dependent role semantics, achieving a mean speedup of 2.79x over the regular implementation.
ISBN: 9781450367127 (print)
In a typical data-processing program, the representation of data in memory is distinct from its representation in a serialized form on disk. The former has pointers and arbitrary, sparse layout, facilitating easy manipulation by a program, while the latter is packed contiguously, facilitating easy I/O. We propose a language, LoCal, to unify in-memory and serialized formats. LoCal extends a region calculus into a location calculus, employing a type system that tracks the byte-addressed layout of all heap values. We formalize LoCal and prove type safety, and show how LoCal programs can be inferred from unannotated source terms. We transform the existing Gibbon compiler to use LoCal as an intermediate language, with the goal of achieving a balance between code speed and data compactness by introducing just enough indirection into heap layouts, preserving the asymptotic complexity of traditional representations, but working with mostly or completely serialized data. We show that our approach yields significant performance improvement over prior approaches to operating on packed data, without abandoning idiomatic programming with recursive functions.
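To make the idea of computing directly on packed data concrete, here is a small sketch that serializes a binary tree in preorder and sums its leaves by walking the byte buffer, without rebuilding a pointer-based tree. The one-byte tags, the 8-byte little-endian leaf encoding, and the function names are assumptions for illustration; they are not LoCal's or Gibbon's actual layout.

```python
import struct

LEAF, NODE = 0, 1  # one-byte constructor tags (assumed encoding)


def pack(tree):
    """Serialize a tree in preorder.

    A tree is either an int (leaf) or a (left, right) tuple (interior node).
    """
    if isinstance(tree, int):
        return bytes([LEAF]) + struct.pack("<q", tree)
    left, right = tree
    return bytes([NODE]) + pack(left) + pack(right)


def sum_packed(buf, pos=0):
    """Sum all leaves by scanning the buffer.

    Returns (total, position just past the subtree that started at pos).
    """
    if buf[pos] == LEAF:
        (value,) = struct.unpack_from("<q", buf, pos + 1)
        return value, pos + 9  # 1 tag byte + 8 payload bytes
    left_sum, pos = sum_packed(buf, pos + 1)
    right_sum, pos = sum_packed(buf, pos)
    return left_sum + right_sum, pos


buf = pack(((1, 2), 3))
total, end = sum_packed(buf)
print(total, end == len(buf))
```

The recursive function keeps the shape of an ordinary tree traversal, but its only state is a cursor into contiguous bytes. A layout like this has no way to skip a left subtree without scanning it, which is the kind of case where the abstract's "just enough indirection" would matter.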
ISBN: 9781450367127 (print)
Real-world cryptographic code is often written in a subset of C intended to execute in constant time, thereby avoiding timing side-channel vulnerabilities. This C subset eschews structured programming as we know it: if-statements, looping constructs, and procedural abstractions can leak timing information when handling sensitive data. The resulting obfuscation has led to subtle bugs, even in widely used, high-profile libraries like OpenSSL. To address the challenge of writing constant-time cryptographic code, we present FaCT, a crypto DSL that provides high-level but safe language constructs. The FaCT compiler uses a secrecy type system to automatically transform potentially timing-sensitive high-level code into low-level, constant-time LLVM bitcode. We develop the language and type system, formalize the constant-time transformation, and present an empirical evaluation that uses FaCT to implement core crypto routines from several open-source projects including OpenSSL, libsodium, and curve25519-donna. Our evaluation shows that FaCT's design makes it possible to write readable, high-level cryptographic code with efficient, constant-time behavior.
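The kind of rewrite that replaces a secret-dependent branch with straight-line code can be sketched as a mask-based select. This is a generic, well-known idiom shown only to illustrate the shape of such a transformation; it is an assumption that FaCT's output resembles it, and Python itself gives no timing guarantees, so the sketch demonstrates the arithmetic, not a secure implementation.

```python
MASK32 = 0xFFFFFFFF  # work on 32-bit unsigned values


def branchy_select(cond, a, b):
    """Readable version: the branch taken depends on the (secret) condition."""
    if cond:
        return a
    return b


def masked_select(cond, a, b):
    """Branch-free version: cond must be 0 or 1; a and b must fit in 32 bits."""
    mask = (-cond) & MASK32  # cond=1 -> 0xFFFFFFFF, cond=0 -> 0x00000000
    return (a & mask) | (b & ~mask & MASK32)


for cond in (0, 1):
    assert masked_select(cond, 0xDEADBEEF, 0x12345678) == branchy_select(
        cond, 0xDEADBEEF, 0x12345678
    )
```

Both functions compute the same result, but the second executes the same sequence of operations regardless of `cond`, which is the property hand-written constant-time C code tries to maintain.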
ISBN: 9781450367189 (print)
The proceedings contain 8 papers. The topics discussed include: fluid data structures; detecting unsatisfiable CSS rules in the presence of DTDs; towards compiling graph queries in relational engines; streaming saturation for large RDF graphs with dynamic schema information; Arc: an IR for batch and stream programming; on the semantics of Cypher's implicit group-by; mixing set and bag semantics; and language-integrated provenance by trace analysis.
ISBN: 9781450367127 (print)
Recent advances in machine learning (ML) have produced kilobyte-size models that can run directly on constrained IoT devices. This approach avoids expensive communication between IoT devices and the cloud, thereby enabling energy-efficient real-time analytics. However, ML models are typically expressed in floating-point, and IoT hardware typically does not support floating-point. Therefore, running these models on IoT devices requires simulating IEEE-754 floating-point in software, which is very inefficient. We present SeeDot, a domain-specific language to express ML inference algorithms and a compiler that compiles SeeDot programs to fixed-point code that can efficiently run on constrained IoT devices. We propose 1) a novel compilation strategy that reduces the search space for some key parameters used in the fixed-point code, and 2) new efficient implementations of expensive operations. SeeDot compiles state-of-the-art KB-sized models to various microcontrollers and low-end FPGAs. We show that SeeDot outperforms 1) software emulation of floating-point (Arduino), 2) high-bitwidth fixed-point (MATLAB), 3) post-training quantization (TensorFlow-Lite), and 4) floating- and fixed-point FPGA implementations generated using high-level synthesis tools.
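As background for what "fixed-point code" means here, the following sketch shows textbook fixed-point arithmetic: a real number is stored as an integer together with an implied power-of-two scale, and multiplication is followed by a shift to restore the scale. The helper names and the `scale_bits` parameter are assumptions for illustration; the sketch does not reproduce SeeDot's compilation strategy or its generated code.

```python
def to_fixed(x, scale_bits):
    """Store x as an integer n such that x is approximately n / 2**scale_bits."""
    return int(round(x * (1 << scale_bits)))


def fixed_mul(a, b, scale_bits):
    """Multiply two fixed-point values that share the same scale.

    The raw product carries 2 * scale_bits fractional bits, so shift back.
    """
    return (a * b) >> scale_bits


def to_float(n, scale_bits):
    """Convert a fixed-point integer back to a float for inspection."""
    return n / (1 << scale_bits)


SCALE = 8  # 8 fractional bits (an arbitrary choice for this example)
a = to_fixed(1.5, SCALE)
b = to_fixed(2.25, SCALE)
print(to_float(fixed_mul(a, b, SCALE), SCALE))
```

Only integer multiply and shift are needed at run time, which is why this representation suits hardware without a floating-point unit. Too few fractional bits lose precision and too many risk overflow on narrow integers, so the choice of scale matters.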
ISBN: 9781450367127 (print)
Automatically transforming programs is hard, yet critical for automated program refactoring, rewriting, and repair. Multi-language syntax transformation is especially hard due to heterogeneous representations in syntax, parse trees, and abstract syntax trees (ASTs). Our insight is that the problem can be decomposed such that (1) a common grammar expresses the central context-free language (CFL) properties shared by many contemporary languages and (2) open extension points in the grammar allow customizing syntax (e.g., for balanced delimiters) and hooks in smaller parsers to handle language-specific syntax (e.g., for comments). Our key contribution operationalizes this decomposition using a Parser Parser Combinator (PPC), a mechanism that generates parsers for matching syntactic fragments in source code by parsing declarative user-supplied templates. This allows our approach to detach from translating input programs to any particular abstract syntax tree representation, and lifts syntax rewriting to a modularly defined parsing problem. A notable effect is that we skirt the complexity and burden of defining additional translation layers between concrete user input templates and an underlying abstract syntax representation. We demonstrate that these ideas admit efficient and declarative rewrite templates across 12 languages, and validate the effectiveness of our approach by producing correct and desirable lightweight transformations on popular real-world projects (over 50 syntactic changes produced by our approach have been merged into 40+). Our declarative rewrite patterns require an order of magnitude less code compared to analogous implementations in existing, language-specific tools.
ISBN: 9781450367127 (print)
A fundamental challenge in automated reasoning about programming assignments at scale is clustering student submissions based on their underlying algorithms. State-of-the-art clustering techniques are sensitive to control structure variations, cannot cluster buggy solutions with similar correct solutions, and require either expensive pair-wise program analyses or training efforts. We propose a novel technique that can cluster small imperative programs based on their algorithmic essence: (A) how the input space is partitioned into equivalence classes and (B) how the problem is uniquely addressed within individual equivalence classes. We capture these algorithmic aspects as two quantitative semantic program features that are merged into a program's vector representation. Programs are then clustered using their vector representations. The computation of our first semantic feature leverages model counting to identify the number of inputs belonging to an input equivalence class. The computation of our second semantic feature abstracts the program's data flow by tracking the number of occurrences of a unique pair of consecutive values of a variable during its lifetime. The comprehensive evaluation of our tool SemCluster on benchmarks drawn from solutions to small programming assignments shows that SemCluster (1) generates far fewer clusters than other clustering techniques, (2) precisely identifies distinct solution strategies, and (3) boosts the performance of clustering-based program repair, all within a reasonable amount of time.
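The second feature described above, counting how often each pair of consecutive values of a variable occurs, can be illustrated with a short sketch. It assumes the sequence of values a variable takes during one run has already been recorded as a list; the function name and this trace format are hypothetical, and how SemCluster collects and aggregates such counts is not specified here.

```python
from collections import Counter


def consecutive_value_pairs(trace):
    """Count each (previous value, next value) pair in a variable's value trace."""
    return Counter(zip(trace, trace[1:]))


# Hypothetical trace of one variable's values over a single execution.
trace = [0, 1, 2, 1, 2]
for pair, count in sorted(consecutive_value_pairs(trace).items()):
    print(pair, count)
```

Two programs that update a variable through the same value transitions would produce the same counts even if their loops or conditionals are written differently, which is the intuition behind a feature that is less sensitive to control structure.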
ISBN: 9781450367202 (print)
The proceedings contain 7 papers. The topics discussed include: fixpoint reuse for incremental JavaScript analysis; know your analysis: how instrumentation aids understanding static analysis; SootDiff: bytecode comparison across different Java compilers; modernizing parsing tools: parsing and analysis with object-oriented programming; commit-time incremental analysis; program analysis for process migration; and MetaDL: analyzing Datalog in Datalog.
The proceedings contain 9 papers. The topics discussed include: finite difference methods fengshui: alignment through a mathematics of arrays; data-parallel flattening by expansion; ALPyNA: acceleration of loops in pyth...