Adaptive optics systems are usually prototyped in a convenient but slow language like MATLAB or Python, and then re-written from scratch using high-performance C/C++ to perform real-time control. This duplication of e...
详细信息
ISBN:
(纸本)9781510675186;9781510675179
Adaptive optics systems are usually prototyped in a convenient but slow language like MATLAB or Python, and then re-written from scratch using high-performance C/C++ to perform real-time control. This duplication of effort adds costs and slows the experimentation process. We present instead a technical demonstration of performing real time, sub-millisecond latency control with an adaptive optics system using the high-level Julia programming language. This open-source software demonstrates support for multiple cameras, pixel streaming, and network-transparency distributedcomputing. Furthermore, it is easy to interface it with other programminglanguages as desired.
Worked examples, which illustrate the process for solving a problem step-by-step, are a well-established pedagogical technique that has been widely studied in computing classrooms. However, creating high-quality worke...
详细信息
ISBN:
(纸本)9798400716195
Worked examples, which illustrate the process for solving a problem step-by-step, are a well-established pedagogical technique that has been widely studied in computing classrooms. However, creating high-quality worked examples is very time-intensive for educators, and thus learners tend not to have access to a broad range of such examples. The recent emergence of powerful large language models (LLMs), which appear capable of generating high-quality humanlike content, may offer a solution. Separate strands of recent work have shown that LLMs can accurately generate code suitable for a novice audience, and that they can generate high-quality explanations of code. Therefore, LLMs may be well suited to creating a broad range of worked examples, overcoming the bottleneck of manual effort that is currently required. In this work, we present a novel tool, 'WorkedGen', which uses an LLM to generate interactive worked examples. We evaluate this tool with both an expert assessment of the content, and a user study involving students in a first-year Python programming course ( n = similar to 400). We find that prompt chaining and one-shot learning are useful strategies for optimising the output of an LLM when producing worked examples. Our expert analysis suggests that LLMs generate clear explanations, and our classroom deployment revealed that students find the LLM-generated worked examples useful for their learning. We propose several avenues for future work, including investigating WorkedGen's value in a range of programminglanguages, and with more complex questions suitable for more advanced courses.
Sparse matrix dense matrix multiplication (SpMM) is commonly used in applications ranging from scientific computing to graph neural networks. Typically, when SpMM is executed in a distributed platform, communication c...
详细信息
ISBN:
(纸本)9798400703850
Sparse matrix dense matrix multiplication (SpMM) is commonly used in applications ranging from scientific computing to graph neural networks. Typically, when SpMM is executed in a distributed platform, communication costs dominate. Such costs depend on how communication is scheduled. If it is scheduled in a sparsity-unaware manner, such as with collectives, execution is often inefficient due to unnecessary data transfers. On the other hand, if communication is scheduled in a fine-grained sparsity-aware manner, communicating only the necessary data, execution can also be inefficient due to high software overhead. We observe that individual sparse matrices often contain regions that are denser and regions that are sparser. Based on this observation, we develop a model that partitions communication into sparsity-unaware and sparsity-aware components. Leveraging the partition, we develop a new algorithm that performs collective communication for the denser regions, and fine-grained, one-sided communication for the sparser regions. We call the algorithm Two-Face. We show that Two-Face attains an average speedup of 2.11x over prior work when evaluated on a 4096-core supercomputer. Additionally, Two-Face scales well with the machine size.
Expressive state-of-the-art separation logics rely on step-indexing to model semantically complex features and to support modular reasoning about imperative higher-order concurrent and distributed programs. Step-index...
详细信息
Expressive state-of-the-art separation logics rely on step-indexing to model semantically complex features and to support modular reasoning about imperative higher-order concurrent and distributed programs. Step-indexing comes, however, with an inherent cost: it restricts the adequacy theorem of program logics to a fairly simple class of safety properties. In this paper, we explore if and how intensional refinement is a viable methodology for strengthening higher-order concurrent (and distributed) separation logic to prove non-trivial safety and liveness properties. Specifically, we introduce Trillium, a language-agnostic separation logic framework for showing intensional refinement relations between traces of a program and a model. We instantiate Trillium with a concurrent language and develop Fairis, a concurrent separation logic, that we use to show liveness properties of concurrent programs under fair scheduling assumptions through a fair liveness-preserving refinement of a model. We also instantiate Trillium with a distributed language and obtain an extension of Aneris, a distributed separation logic, which we use to show refinement relations between distributed systems and TLA(+) models.
Byzantine fault-tolerant state machine replication (SMR) protocols, such as PBFT, HotStuff, and Jolteon, are essential for modern blockchain technologies. However, they are challenging to implement correctly because t...
详细信息
Byzantine fault-tolerant state machine replication (SMR) protocols, such as PBFT, HotStuff, and Jolteon, are essential for modern blockchain technologies. However, they are challenging to implement correctly because they have to deal with any unexpected message from Byzantine peers and ensure safety and liveness at all times. Many formal frameworks have been developed to verify the safety of SMR implementations, but there is still a gap in the verification of their liveness. Existing liveness proofs are either limited to the network level or do not cover popular partially synchronous protocols. We introduce LiDO, a consensus model that enables the verification of both safety and liveness of implementations through refinement. We observe that current consensus models cannot handle liveness because they do not include a pacemaker state. We show that by adding a pacemaker state to the LiDO model, we can express the liveness properties of SMR protocols as a few safety properties that can be easily verified by refinement proofs. Based on our LiDO model, we provide mechanized safety and liveness proofs for both unpipelined and pipelined Jolteon in Coq. This is the first mechanized liveness proof for a byzantine consensus protocol with non-trivial optimizations such as pipelining.
The discussion around "safe"programminglanguages has significantly increased in recent years, and is impacting how governments, industry, and academia plan to develop current and future software products. T...
详细信息
A million qubit-scale quantum computer is essential to realize the quantum supremacy. Modern large-scale quantum computers integrate multiple quantum computers located in dilution refrigerators (DR) to overcome each D...
详细信息
ISBN:
(纸本)9798400703850
A million qubit-scale quantum computer is essential to realize the quantum supremacy. Modern large-scale quantum computers integrate multiple quantum computers located in dilution refrigerators (DR) to overcome each DR's unscaling cooling budget. However, a large-scale multi-DR quantum computer introduces its unique challenges (i.e., slow and erroneous inter-DR entanglement, increased qubit scale), and they make the baseline error handling mechanism ineffective by increasing the number of gate operations and the interDR communication latency to decode and correct errors. Without resolving these challenges, it is impossible to realize a fault-tolerant large-scale multi-DR quantum computer. In this paper, we propose a million qubit-scale distributed quantum computer which uses a novel error handling mechanism enabling fault-tolerant multi-DR quantum computing. First, we apply a low-overhead multi-DR error syndrome measurement (ESM) sequence to reduce both the number of gate operations and the error rate. Second, we apply a scalable multi-DR error decoding unit (EDU) architecture to maximize both the decoding speed and accuracy. Our multiDR error handling SW-HW co-design improves the ESM latency, ESM errors, EDU latency, and EDU accuracy by 3.7 times, 2.4 times, 685 times, and 6.1 center dot 10(10) times, respectively. With our scheme applied to assumed voltage-scaled CMOS and mature ERSFQ technologies, we successfully build a fault-tolerant million qubit-scale quantum computer.
Online GNN inference has been widely explored by applications such as online recommendation and financial fraud detection systems, where even minor delays can result in significant financial impact. Real-time dynamic ...
详细信息
ISBN:
(纸本)9798400714436
Online GNN inference has been widely explored by applications such as online recommendation and financial fraud detection systems, where even minor delays can result in significant financial impact. Real-time dynamic graph sampling enables online GNN inference to reflect the latest graph updates in real-world graphs. However, online GNN inference typically demands millisecond-level latency Service Level Objectives (SLOs) as its performance guarantees, which poses great challenges for existing dynamic graph sampling approaches based on graph databases. The issues mainly arise from two aspects: long tail latency due to imbalanced data-dependent sampling and large communication overhead incurred by distributed sampling. To address these issues, we propose Helios, an efficient distributed dynamic graph sampling service to meet the stringent latency SLOs. The key ideas of Helios are 1) pre-sampling the dynamic graph in an event-driven approach, and 2) maintaining a query-aware sample cache to build the complete K-hop sampling results locally for inference requests. Experiments on multiple datasets show that Helios achieves up to 67x higher serving throughput and up to 32x lower P99 query latency compared to baselines.
The rapid proliferation of LLM-based programming assistants has enabled fast and accurate automatic code generation for general purpose programminglanguages. Domain-specific languages like Ansible, a DSL for IT Autom...
详细信息
ISBN:
(纸本)9798400705649
The rapid proliferation of LLM-based programming assistants has enabled fast and accurate automatic code generation for general purpose programminglanguages. Domain-specific languages like Ansible, a DSL for IT Automation, have seen a lack of support despite being critical to many fields, due to limited public-domain code for training models and a lack of interest from tool developers. To address this issue, we collect a novel dataset of permissively licensed Ansible code, and use it to create Warp, an LLM for code fine-tuned to produce Ansible tasks from a natural language prompt. We evaluate state-of-the-art tools for LLM-based code generation models, comparing multiple common strategies, including fine-tuning base models on Ansible code and retrieval-augmented-generation using documentation, in order to understand challenges with existing methodology and identify future research directions to enable better code generation for DSLs.
In introductory programming, students must develop an accurate mental model of how programminglanguages work. This model, often called a 'notional machine,' is essential for understanding how a machine interp...
详细信息
ISBN:
(纸本)9798400709975
In introductory programming, students must develop an accurate mental model of how programminglanguages work. This model, often called a 'notional machine,' is essential for understanding how a machine interprets and executes code. Existing research highlights the importance of building effective and accurate mental models through code-tracing activities and tools like code visualizations. However, effectively integrating such tools into post-secondary classes remains challenging, especially in large classroom settings. To address this, we have developed Code Diagram Queries (CDQs) for introductory programming courses to help students build mental models of programming language notional machines. CDQs are questions incorporating diagrammatic representations of code at various execution stages to foster student engagement and comprehension of how the code is executed. CDQs were designed to challenge and refine student mental models of code execution. The effectiveness of these CDQs was assessed in an introductory Python programming course, where students in one section engaged with CDQ-based normative assessments (n=94) and students in a control section engaged with non-CDQ normative assessments (n=82). Through comparative evaluations of course performance and visualization engagement, as well as qualitative interview responses, we found preliminary evidence that CDQs helped identify and clarify misconceptions around abstract programming concepts.
暂无评论