The availability of Large language Models (LLMs) which can generate code, has made it possible to create tools that improve developer productivity. Integrated development environments or IDEs which developers use to w...
详细信息
ISBN:
(数字)9798400712487
ISBN:
(纸本)9798400712487
The availability of Large language Models (LLMs) which can generate code, has made it possible to create tools that improve developer productivity. Integrated development environments or IDEs which developers use to write software are often used as an interface to interact with LLMs. Although many such tools have been released, almost all of them focus on general-purpose programminglanguages. Domain-specific languages, such as those crucial for Information Technology (IT) automation, have not received much attention. Ansible is one such YAML-based IT automation-specific language. Ansible Lightspeed is an LLM-based service designed explicitly to generate Ansible YAML, given natural language prompt. In this paper, we present the design and implementation of the Ansible Lightspeed service. We then evaluate its utility to developers using diverse indicators, including extended utilization, analysis of user edited suggestions, as well as user sentiments analysis. The evaluation is based on data collected for 10,696 real users including 3,910 returning users. The code for Ansible Lightspeed service and the analysis framework is made available for others to use. To our knowledge, our study is the first to involve thousands of users of code assistants for domain-specific languages. We are also the first code completion tool to present N-Day user retention figures, which is 13.66% on Day 30. We propose an improved version of user acceptance rate, called Strong Acceptance rate, where a suggestion is considered accepted only if less than 50% of it is edited and these edits do not change critical parts of the suggestion. By focusing on Ansible, Lightspeed is able to achieve a strong acceptance rate of 49.08% for multi-line Ansible task suggestions. With our findings we provide insights into the effectiveness of small, dedicated models in a domain-specific context. We hope this work serves as a reference for software engineering and machine learning researchers exploring code c
The proceedings contain 148 papers. The topics discussed include: a PLC-based three-dimensional warehouse simulation system;pollen dictionary construction based on supervised comparative learning;application and resea...
ISBN:
(纸本)9798400716942
The proceedings contain 148 papers. The topics discussed include: a PLC-based three-dimensional warehouse simulation system;pollen dictionary construction based on supervised comparative learning;application and research of high-quality pixel streaming architecture based on unreal engine;development and research of generative animation based on AIGC;distributed roof photovoltaics monitoring system for the whole district;a system to increase the accuracy of deep learning based classification by the distribution of labels;fine-tuning of financial large language model and application at edge device;deep learning compiler optimization based on constraint programming;and design and implementation of automatic monitoring software for seismic observation instrument.
We present the Sum-Product Probabilistic language (SPPL), a new probabilistic programminglanguage that automatically delivers exact solutions to a broad range of probabilistic inference queries. SPPL translates proba...
详细信息
ISBN:
(纸本)9781450383912
We present the Sum-Product Probabilistic language (SPPL), a new probabilistic programminglanguage that automatically delivers exact solutions to a broad range of probabilistic inference queries. SPPL translates probabilistic programs into sum-product expressions, a new symbolic representation and associated semantic domain that extends standard sum-product networks to support mixed-type distributions, numeric transformations, logical formulas, and pointwise and set-valued constraints. We formalize SPPL via a novel translation strategy from probabilistic programs to sum-product expressions and give sound exact algorithms for conditioning on and computing probabilities of events. SPPL imposes a collection of restrictions on probabilistic programs to ensure they can be translated into sum-product expressions, which allow the system to leverage new techniques for improving the scalability of translation and inference by automatically exploiting probabilistic structure. We implement a prototype of SPPL with a modular architecture and evaluate it on benchmarks the system targets, showing that it obtains up to 3500x speedups over state-of-the-art symbolic systems on tasks such as verifying the fairness of decision tree classifiers, smoothing hidden Markov models, conditioning transformed random variables, and computing rare event probabilities.
Datalog is becoming increasingly popular as a standard tool for a variety of use cases. Modern Datalog engines can achieve high performance by specializing data structures for relational operations. For example, the D...
详细信息
ISBN:
(纸本)9781450383912
Datalog is becoming increasingly popular as a standard tool for a variety of use cases. Modern Datalog engines can achieve high performance by specializing data structures for relational operations. For example, the Datalog engine Souffle achieves high performance with a synthesizer that specializes data structures for relations. However, the synthesizer cannot always be deployed, and a fast interpreter is required. This work introduces the design and implementation of the Souffle Tree Interpreter (STI). Key for the performance of the STI is the support for fast operations on relations. We obtain fast operations by de-specializing data structures so that they can work in a virtual execution environment. Our new interpreter achieves a competitive performance slowdown between 1.32 and 5.67x when compared to synthesized code. If compile time overheads of the synthesizer are also considered, the interpreter can be 6.46x faster on average for the first run.
Since the introduction of CompCert, researchers have been refining its language semantics and correctness theorem, and used them as components in software verification efforts. Meanwhile, artifacts ranging from CPU de...
详细信息
ISBN:
(纸本)9781450383912
Since the introduction of CompCert, researchers have been refining its language semantics and correctness theorem, and used them as components in software verification efforts. Meanwhile, artifacts ranging from CPU designs to network protocols have been successfully verified, and there is interest in making them interoperable to tackle end-to-end verification at an even larger scale. Recent work shows that a synthesis of game semantics, refinement-based methods, and abstraction layers has the potential to serve as a common theory of certified components. Integrating certified compilers to such a theory is a critical goal. However, none of the existing variants of CompCert meets the requirements we have identified for this task. CompCertO extends the correctness theorem of CompCert to characterize compiled program components directly in terms of their interaction with each other. Through a careful and compositional treatment of calling conventions, this is achieved with minimal effort.
Program synthesis tasks are commonly specified via input-output examples. Existing enumerative techniques for such tasks are primarily guided by program syntax and only make indirect use of the examples. We identify a...
详细信息
ISBN:
(纸本)9781450383912
Program synthesis tasks are commonly specified via input-output examples. Existing enumerative techniques for such tasks are primarily guided by program syntax and only make indirect use of the examples. We identify a class of synthesis algorithms for programming-by-examples, which we call Example-Guided Synthesis (EGS), that exploits latent structure in the provided examples while generating candidate programs. We present an instance of EGS for the synthesis of relational queries and evaluate it on 86 tasks from three application domains: knowledge discovery, program analysis, and database querying. Our evaluation shows that EGS outperforms state-of-the-art synthesizers based on enumerative search, constraint solving, and hybrid techniques in terms of synthesis time, quality of synthesized programs, and ability to prove unrealizability.
Before implementing a function, programmers are encouraged to write a purpose statement i.e., a short, natural-language explanation of what the function computes. A purpose statement may be ambiguous i.e., it may fail...
详细信息
ISBN:
(纸本)9798400708404
Before implementing a function, programmers are encouraged to write a purpose statement i.e., a short, natural-language explanation of what the function computes. A purpose statement may be ambiguous i.e., it may fail to specify the intended behaviour when two or more inequivalent computations are plausible on certain inputs. Our paper makes four contributions. First, we propose a novel heuristic that suggests such inputs using Large language Models (LLMs). Using these suggestions, the programmer may choose to clarify the purpose statement (e.g., by providing a functional example that specifies the intended behaviour on such an input). Second, to assess the quality of inputs suggested by our heuristic, and to facilitate future research, we create an open dataset of purpose statements with known ambiguities. Third, we compare our heuristic against GitHub Copilot's Chat feature, which can suggest similar inputs when prompted to generate unit tests. Fourth, we provide an open-source implementation of our heuristic as an extension to Visual Studio Code for the Python programminglanguage, where purpose statements and functional examples are specified as docstrings and doctests respectively. We believe that this tool will be particularly helpful to novice programmers and instructors.
In quantum computing, the basic unit of information is a qubit. Simulation of a general quantum program takes exponential time in the number of qubits, which makes simulation infeasible beyond 50 qubits on current sup...
详细信息
ISBN:
(纸本)9781450383912
In quantum computing, the basic unit of information is a qubit. Simulation of a general quantum program takes exponential time in the number of qubits, which makes simulation infeasible beyond 50 qubits on current supercomputers. So, for the understanding of larger programs, we turn to static techniques. In this paper, we present an abstract interpretation of quantum programs and we use it to automatically verify assertions in polynomial time. Our key insight is to let an abstract state be a tuple of projections. For such domains, we present abstraction and concretization functions that form a Galois connection and we use them to define abstract operations. Our experiments on a laptop have verified assertions about the Bernstein-Vazirani, GHZ, and Grover benchmarks with 300 qubits.
While Alexa can perform over 100,000 skills, its capability covers only a fraction of what is possible on the web. Individuals need and want to automate a long tail of web-based tasks which often involve visiting diff...
详细信息
ISBN:
(纸本)9781450383912
While Alexa can perform over 100,000 skills, its capability covers only a fraction of what is possible on the web. Individuals need and want to automate a long tail of web-based tasks which often involve visiting different websites and require programming concepts such as function composition, conditional, and iterative evaluation. This paper presents diya (Do-It-Yourself Assistant), a new system that empowers users to create personalized web-based virtual assistant skills that require the full generality of composable control constructs, without having to learn a formal programminglanguage. With DIYA, the user demonstrates their task of interest in the browser and issues a few simple voice commands, such as naming the skills and adding conditions on the action. diya turns these multi-modal specifications into voice-invocable skills written in the ThingTalk 2.0 programminglanguage we designed for this purpose. diya is a prototype that works in the Chrome browser. Our user studies show that 81% of the proposed routines can be expressed using diya. diya is easy to learn, and 80% of users surveyed find diya useful.
Web applications often handle large amounts of sensitive user data. Modern secure web frameworks protect this data by (1) using declarative languages to specify security policies alongside database schemas and (2) aut...
详细信息
ISBN:
(纸本)9781450383912
Web applications often handle large amounts of sensitive user data. Modern secure web frameworks protect this data by (1) using declarative languages to specify security policies alongside database schemas and (2) automatically enforcing these policies at runtime. Unfortunately, these frameworks do not handle the very common situation in which the schemas or the policies need to evolve over time-and updates to schemas and policies need to be performed in a carefully coordinated way. Mistakes during schema or policy migrations can unintentionally leak sensitive data or introduce privilege escalation bugs. In this work, we present a domain-specific language (Scooter) for expressing schema and policy migrations, and an associated SMT-based verifier (Sidecar) which ensures that migrations are secure as the application evolves. We describe the design of Scooter and Sidecar and show that our framework can be used to express realistic schemas, policies, and migrations, without giving up on runtime or verification performance.
暂无评论