ISBN:
(Print) 9781450367127
In a typical data-processing program, the representation of data in memory is distinct from its representation in a serialized form on disk. The former has pointers and arbitrary, sparse layout, facilitating easy manipulation by a program, while the latter is packed contiguously, facilitating easy I/O. We propose a language, LoCal, to unify in-memory and serialized formats. LoCal extends a region calculus into a location calculus, employing a type system that tracks the byte-addressed layout of all heap values. We formalize LoCal and prove type safety, and show how LoCal programs can be inferred from unannotated source terms. We transform the existing Gibbon compiler to use LoCal as an intermediate language, with the goal of achieving a balance between code speed and data compactness by introducing just enough indirection into heap layouts, preserving the asymptotic complexity of traditional representations, but working with mostly or completely serialized data. We show that our approach yields significant performance improvement over prior approaches to operating on packed data, without abandoning idiomatic programming with recursive functions.
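The idea of computing directly over a serialized heap can be illustrated with a minimal sketch. This is not LoCal or Gibbon itself; it assumes a hypothetical preorder byte layout (tag 0 = leaf followed by a one-byte payload, tag 1 = node followed by two serialized subtrees) and shows a recursive function running straight over the packed buffer, with no pointer-based tree in memory.

```python
def serialize(tree, buf):
    """Pack a nested-tuple tree ('leaf', n) / ('node', l, r) into buf
    in preorder, using the hypothetical tag scheme described above."""
    if tree[0] == 'leaf':
        buf.append(0)          # leaf tag
        buf.append(tree[1])    # one-byte payload
    else:
        buf.append(1)          # node tag; two subtrees follow in place
        serialize(tree[1], buf)
        serialize(tree[2], buf)
    return buf

def sum_packed(buf, off=0):
    """Sum leaf payloads directly off the byte buffer.
    Returns (total, offset just past this subtree)."""
    if buf[off] == 0:                      # leaf
        return buf[off + 1], off + 2
    left, off2 = sum_packed(buf, off + 1)  # node: first subtree
    right, off3 = sum_packed(buf, off2)    # second subtree starts where
    return left + right, off3              # the first one ended

tree = ('node', ('leaf', 3), ('node', ('leaf', 4), ('leaf', 5)))
buf = serialize(tree, bytearray())
total, _ = sum_packed(buf)
# total == 12
```

The threading of the "next offset" through the traversal stands in for what the paper's location-typed intermediate language tracks statically.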
ISBN:
(Print) 9781450367127
A fundamental challenge in automated reasoning about programming assignments at scale is clustering student submissions based on their underlying algorithms. State-of-the-art clustering techniques are sensitive to control structure variations, cannot cluster buggy solutions with similar correct solutions, and either require expensive pair-wise program analyses or training efforts. We propose a novel technique that can cluster small imperative programs based on their algorithmic essence: (A) how the input space is partitioned into equivalence classes and (B) how the problem is uniquely addressed within individual equivalence classes. We capture these algorithmic aspects as two quantitative semantic program features that are merged into a program's vector representation. Programs are then clustered using their vector representations. The computation of our first semantic feature leverages model counting to identify the number of inputs belonging to an input equivalence class. The computation of our second semantic feature abstracts the program's data flow by tracking the number of occurrences of a unique pair of consecutive values of a variable during its lifetime. The comprehensive evaluation of our tool SemCluster on benchmarks drawn from solutions to small programming assignments shows that SemCluster (1) generates far fewer clusters than other clustering techniques, (2) precisely identifies distinct solution strategies, and (3) boosts the performance of clustering-based program repair, all within a reasonable amount of time.
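The first feature, counting inputs per equivalence class, can be sketched in a toy form. This is not SemCluster's model-counting machinery; it assumes two hypothetical student solutions to "absolute value" and uses exhaustive enumeration over a tiny input range as a stand-in for model counting, showing that syntactically different programs with the same input partitioning get identical feature vectors.

```python
def sol_a(x):
    return x if x >= 0 else -x

def sol_b(x):              # different syntax, same input partitioning
    return abs(x)

def partition_feature(prog, inputs):
    """Approximate an input-equivalence-class count: group inputs by
    whether the program leaves them unchanged or negates them, and
    return the sorted class sizes as the feature vector."""
    classes = {}
    for x in inputs:
        key = (prog(x) == x)   # True: identity class, False: negated
        classes[key] = classes.get(key, 0) + 1
    return tuple(sorted(classes.values()))

inputs = range(-50, 51)
fa = partition_feature(sol_a, inputs)
fb = partition_feature(sol_b, inputs)
# fa == fb, so the two solutions would land in the same cluster
```

In the real system these vectors would feed a standard clustering algorithm rather than a direct equality check.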
ISBN:
(Print) 9781450367127
Real-world cryptographic code is often written in a subset of C intended to execute in constant-time, thereby avoiding timing side channel vulnerabilities. This C subset eschews structured programming as we know it: if-statements, looping constructs, and procedural abstractions can leak timing information when handling sensitive data. The resulting obfuscation has led to subtle bugs, even in widely-used high-profile libraries like OpenSSL. To address the challenge of writing constant-time cryptographic code, we present FaCT, a crypto DSL that provides high-level but safe language constructs. The FaCT compiler uses a secrecy type system to automatically transform potentially timing-sensitive high-level code into low-level, constant-time LLVM bitcode. We develop the language and type system, formalize the constant-time transformation, and present an empirical evaluation that uses FaCT to implement core crypto routines from several open-source projects including OpenSSL, libsodium, and curve25519-donna. Our evaluation shows that FaCT's design makes it possible to write readable, high-level cryptographic code, with efficient, constant-time behavior.
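The core rewrite such a compiler performs, replacing a secret-dependent branch with arithmetic selection, can be sketched as follows. Python itself gives no timing guarantees, so this only illustrates the shape of the branchless transformation (here on 32-bit values), not a usable constant-time primitive.

```python
MASK32 = 0xFFFFFFFF

def ct_select(cond_bit, a, b):
    """Return a if cond_bit == 1 else b, without branching on the
    (secret) condition: build an all-ones or all-zeros mask from the
    bit and combine the operands arithmetically."""
    mask = (-cond_bit) & MASK32               # 0xFFFFFFFF or 0x00000000
    return (a & mask) | (b & ~mask & MASK32)

# The branchy original that ct_select replaces would be:
#   def select(secret, a, b): return a if secret else b
# which leaks which arm was taken through its timing.
```

A constant-time compiler applies this kind of rewrite automatically wherever its secrecy types mark the condition as sensitive.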
ISBN:
(Print) 9781450367127
Recent advances in machine learning (ML) have produced kilobyte-size models that can directly run on constrained IoT devices. This approach avoids expensive communication between IoT devices and the cloud, thereby enabling energy-efficient real-time analytics. However, ML models are typically expressed in floating-point, and IoT hardware typically does not support floating-point. Therefore, running these models on IoT devices requires simulating IEEE-754 floating-point using software, which is very inefficient. We present SeeDot, a domain-specific language to express ML inference algorithms and a compiler that compiles SeeDot programs to fixed-point code that can efficiently run on constrained IoT devices. We propose 1) a novel compilation strategy that reduces the search space for some key parameters used in the fixed-point code, and 2) new efficient implementations of expensive operations. SeeDot compiles state-of-the-art KB-sized models to various microcontrollers and low-end FPGAs. We show that SeeDot outperforms 1) software emulation of floating-point (Arduino), 2) high-bitwidth fixed-point (MATLAB), 3) post-training quantization (TensorFlow-Lite), and 4) floating- and fixed-point FPGA implementations generated using high-level synthesis tools.
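The float-to-fixed idea can be illustrated with a back-of-the-envelope sketch. This is not SeeDot's compilation strategy; it assumes a single fixed scale of 12 fractional bits, which stands in for exactly the kind of key parameter such a compiler searches over to balance range against precision.

```python
SCALE = 12  # fractional bits (an assumed, fixed choice for illustration)

def to_fixed(x):
    """Quantize a float to an integer with SCALE fractional bits."""
    return int(round(x * (1 << SCALE)))

def fixed_mul(a, b):
    """Multiply two fixed-point values: the raw product carries
    2*SCALE fractional bits, so shift back down to SCALE."""
    return (a * b) >> SCALE

def to_float(a):
    return a / (1 << SCALE)

x = to_fixed(1.5)
w = to_fixed(0.25)
y = to_float(fixed_mul(x, w))
# y == 0.375, computed entirely in integer arithmetic
```

On hardware without an FPU, every operation here is an integer multiply and shift, which is the efficiency win over software-emulated IEEE-754.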
ISBN:
(Print) 9781450367127
Most traditional software systems are not built with artificial intelligence (AI) support in mind. Among them, some may require human intervention to operate, e.g., the manual specification of parameters in data-processing programs, or otherwise would behave poorly. We propose a novel framework called Autonomizer to autonomize these systems by installing AI into traditional programs. Autonomizer is general, so it can be applied to many real-world applications. We provide the primitives and the runtime support, where the primitives abstract common tasks of autonomization and the runtime support realizes them transparently. With the support of Autonomizer, users can gain AI support with little engineering effort. Like many other AI applications, the challenge lies in feature selection, which we address by proposing multiple automated strategies based on program analysis. Our experiment results on nine real-world applications show that autonomization only requires adding a few lines to the source code. Besides, for data-processing programs, Autonomizer improves output quality by 161% on average over the default settings. For interactive programs such as games/driving, Autonomizer achieves a higher success rate with lower training time than existing autonomized programs.
ISBN:
(Print) 9781450334686
This paper presents an algorithm for synthesizing recursive functions that process algebraic datatypes. It is founded on proof-theoretic techniques that exploit both type information and input-output examples to prune the search space. The algorithm uses refinement trees, a data structure that succinctly represents constraints on the shape of generated code. We evaluate the algorithm by using a prototype implementation to synthesize more than 40 benchmarks and several non-trivial larger examples. Our results demonstrate that the approach meets or outperforms the state-of-the-art for this domain, in terms of synthesis time or attainable size of the generated programs.
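The prune-by-examples core of such synthesizers can be shown in a toy form. This sketch has no refinement trees or type-directed pruning; it assumes a tiny hand-written candidate grammar and simply discards any candidate that disagrees with an input-output example, which is the baseline that the paper's proof-theoretic techniques accelerate.

```python
# Input-output examples for the hypothetical target f(x) = x + 1.
examples = [(0, 1), (1, 2), (4, 5)]

# A tiny assumed grammar of candidate expressions, paired with their
# meanings; a real synthesizer enumerates these from the type.
candidates = [
    ("x",     lambda x: x),
    ("x * 2", lambda x: x * 2),
    ("x - 1", lambda x: x - 1),
    ("x + 1", lambda x: x + 1),
]

def synthesize(cands, exs):
    """Return the first candidate consistent with every example."""
    for name, f in cands:
        if all(f(i) == o for i, o in exs):
            return name
    return None

found = synthesize(candidates, examples)
# found == "x + 1"
```

The point of the paper's refinement trees is to avoid enumerating and testing candidates one by one, as this naive loop does.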
ISBN:
(Print) 9781450367189
The proceedings contain 8 papers. The topics discussed include: fluid data structures; detecting unsatisfiable CSS rules in the presence of DTDs; towards compiling graph queries in relational engines; streaming saturation for large RDF graphs with dynamic schema information; Arc: an IR for batch and stream programming; on the semantics of Cypher's implicit group-by; mixing set and bag semantics; and language-integrated provenance by trace analysis.
ISBN:
(Print) 9781450367202
The proceedings contain 7 papers. The topics discussed include: fixpoint reuse for incremental JavaScript analysis; know your analysis: how instrumentation aids understanding static analysis; SootDiff: bytecode comparison across different Java compilers; modernizing parsing tools: parsing and analysis with object-oriented programming; commit-time incremental analysis; program analysis for process migration; and MetaDL: analyzing Datalog in Datalog.
The proceedings contain 9 papers. The topics discussed include: finite difference methods fengshui: alignment through a mathematics of arrays; data-parallel flattening by expansion; ALPyNA: acceleration of loops in pyth...
ISBN:
(Print) 9781450368001
The proceedings contain 8 papers. The topics discussed include: GPUIterator: bridging the gap between Chapel and GPU platforms; calling Chapel code: interoperability improvements; towards radix sorting in the Chapel standard library; implementing stencil problems in Chapel: an experience report; Chapel unblocked: recent communication optimizations in Chapel; Arkouda: interactive data exploration backed by Chapel; Chapel graph library (CGL); and Chapel in Cray HPO.