ISBN: (print) 9798400717673
Automated program repair can be seen as automated code generation at a micro-scale. The research done in automated program repair is thus particularly relevant today with the movement towards automatic programming using tools like GitHub Copilot. Since code automatically generated from natural language descriptions lacks an understanding of program semantics, using semantic analysis techniques to auto-correct or rectify the code is valuable. In our work we have proposed the use of semantic or symbolic program analysis techniques to automatically rectify code. In effect, symbolic analysis is used to generalize tests into specifications of intent. These techniques can be employed on manually written code as well as automatically generated code. They have been used for security vulnerability repair in software (thereby achieving autonomous cybersecurity) as well as for supporting intelligent tutoring systems. Apart from the practical value of such techniques, conceptually this gave a new direction for symbolic reasoning: we use it to derive a logical constraint that captures what it means for the program to be "correct", thereby inferring a specification of intended program behavior. We will conclude with a forward-looking perspective on last-mile repair of code generated by large language models, as well as acceptable evidence of correctness for such automatically generated code.
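The idea of generalizing tests into a constraint on intended behavior can be sketched in miniature. The toy below (all names hypothetical; a real system would hand the constraint to an SMT solver rather than enumerate) abstracts a buggy expression into a template with a hole and treats the test suite as the specification that the hole must satisfy:

```python
# Toy template-based repair: the buggy expression "x + 1" is abstracted
# to the template "x + c", and the test suite becomes a constraint on the
# hole c. Real repair tools solve such constraints symbolically; here we
# simply search a small domain. All names are illustrative.

def make_candidate(c):
    """Instantiate the repair template x + c with a concrete constant."""
    return lambda x: x + c

tests = [(2, 4), (5, 7), (0, 2)]  # (input, expected) pairs

def repair(tests, domain=range(-10, 11)):
    """Return a constant satisfying every test: the inferred spec c == 2."""
    for c in domain:
        f = make_candidate(c)
        if all(f(x) == y for x, y in tests):
            return c
    return None

print(repair(tests))  # the three tests generalize to c == 2
```

The point of the sketch is the direction of inference: the tests are not merely checked, they are lifted into a constraint whose solution is a repaired program.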
ISBN: (print) 9798400717673
Over the past decade, my research has been dedicated to the analysis of mobile applications, with a primary focus on the Android platform. As we delve into the intricacies of mobile app quality and trustworthiness, it becomes imperative to question the prevailing bias in our research toward the Android platform. This keynote talk aims to shed light on the reasons behind this predominant focus and provide insights into the shift of my research paradigm towards the iOS platform. Specifically, the talk will illuminate some of the unique challenges of analyzing iOS applications, emphasizing how iOS and Android differ in collecting apps for analysis, in security features, and in app deployment mechanisms. I will showcase key findings from a recent prototype that we developed to identify third-party libraries in iOS apps, offering a comparative lens for understanding the key differences between these platforms.
ISBN: (print) 9798400717673
Concurrency is fundamental for building scalable software systems. Despite the prevalence of such systems, testing them remains an uncomfortable problem for developers. Concurrency bugs are hard to find, reproduce, and fix, and for the most part they are ignored in standard industry practice. To address this need, we have built and deployed an open-source tool called Coyote that uses controlled concurrency testing (CCT) to explore the space of possible interleavings of a concurrent program, looking for bugs. In this talk, I will describe the (rather long) research journey and the several turns it took through a whole community of researchers, finally inspiring the Coyote tool as it is designed today. Coyote has been downloaded around a million times and is used routinely for testing Azure infrastructure services.
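The core CCT idea, exploring interleavings under a scheduler the tester controls, can be illustrated with a toy program (this is only a sketch of the concept, not Coyote's implementation): two "threads" each perform a non-atomic increment, and the tester replays every interleaving of their steps.

```python
# Miniature controlled concurrency testing: each "thread" is a list of
# atomic steps, and the tester enumerates every interleaving, replaying
# the program under full scheduler control to hunt for bad final states.
from itertools import permutations

def run(schedule, threads):
    """Replay one interleaving; steps are functions of shared state."""
    state = {"counter": 0, "tmp": {}}
    cursors = [0] * len(threads)
    for tid in schedule:
        threads[tid][cursors[tid]](tid, state)
        cursors[tid] += 1
    return state["counter"]

def read(tid, s):  s["tmp"][tid] = s["counter"]          # load
def write(tid, s): s["counter"] = s["tmp"][tid] + 1      # non-atomic inc

threads = [[read, write], [read, write]]  # two racy increments

# Every distinct ordering of thread ids is one schedule.
schedules = set(permutations([0, 0, 1, 1]))
outcomes = {run(list(s), threads) for s in schedules}
print(outcomes)  # {1, 2}: the lost-update bug shows only under some schedules
```

Because the buggy outcome (a final count of 1) appears only under particular schedules, random stress testing can miss it for a long time; systematic exploration of the schedule space is what makes CCT effective.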
Data analysis applications are increasingly developed by combining multiple programming languages to exploit the advantages of each. Such applications are called multilingual applications and can accelerate data-in...
ISBN: (print) 9798400717673
Log parsing, which extracts log templates and parameters, is a critical prerequisite for automated log analysis techniques. Though existing log parsers have achieved promising accuracy on public log datasets, they still face many challenges when applied in industry. By studying the characteristics of real-world log data and analyzing the limitations of existing log parsers, we identify two problems. First, it is non-trivial to scale a log parser to a vast number of logs, especially in real-world scenarios where the log data is extremely imbalanced. Second, existing log parsers overlook the importance of user feedback, which is imperative for parser fine-tuning under the continuous evolution of log data. To overcome these challenges, we propose SPINE, a highly scalable log parser guided by user feedback. On top of a log parser equipped with initial grouping and progressive clustering, we propose a novel log data scheduling algorithm to improve the efficiency of parallelization under large-scale imbalanced log data. In addition, we introduce user feedback to let the parser adapt quickly to evolving logs. We evaluated SPINE on 16 public log datasets. SPINE achieves more than 0.90 parsing accuracy on average with the highest parsing efficiency, outperforming the state-of-the-art log parsers. We also evaluated SPINE in the production environment of Microsoft, where it parses 30 million logs in less than 8 minutes using 16 executors, achieving near real-time performance. In addition, our evaluations show that SPINE consistently achieves good accuracy under log evolution with a moderate amount of user feedback.
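To make the template/parameter distinction concrete, here is a generic toy parser (a sketch of the general log-parsing task, not SPINE's grouping or clustering algorithm): lines are grouped by token count, and token positions that vary within a group become parameter wildcards.

```python
# Toy illustration of log parsing: separate the constant template from
# the variable parameters. This is a generic sketch, not SPINE's method.
from collections import defaultdict

def parse(logs):
    """Group logs by token count, then mark varying positions as <*>."""
    groups = defaultdict(list)
    for line in logs:
        tokens = line.split()
        groups[len(tokens)].append(tokens)
    templates = []
    for token_lists in groups.values():
        template = []
        for column in zip(*token_lists):
            template.append(column[0] if len(set(column)) == 1 else "<*>")
        templates.append(" ".join(template))
    return templates

logs = [
    "Connected to 10.0.0.1 port 22",
    "Connected to 10.0.0.7 port 443",
    "Disk full on /var",
]
print(parse(logs))  # ['Connected to <*> port <*>', 'Disk full on /var']
```

Even this naive scheme hints at the scaling problem the abstract describes: a handful of "head" templates can account for millions of lines while rare templates have only a few, which is exactly the imbalance that SPINE's scheduling algorithm targets.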
Patients with acute ischemic stroke can benefit from reperfusion therapy. Nevertheless, there are gray areas where initiation of reperfusion therapy is neither supported nor contraindicated by current practice guidelines. In these situations, a prediction model for mortality can be beneficial in decision-making. This study aimed to develop a mortality prediction model for acute ischemic stroke patients not receiving reperfusion therapy using a stacking ensemble learning model. The model used an artificial neural network as the ensemble (meta-level) classifier. The seven base classifiers were K-nearest neighbors, support vector machine, extreme gradient boosting, random forest, naive Bayes, artificial neural network, and logistic regression. From the clinical data in the International Stroke Trial database, we selected a concise set of variables assessable at presentation. The primary study outcome was all-cause mortality at 6 months. Our stacking ensemble model predicted 6-month mortality with acceptable performance in ischemic stroke patients not receiving reperfusion therapy. On a held-out validation set, the area under the receiver-operating characteristic curve, accuracy, sensitivity, and specificity of the stacking ensemble classifier were 0.783 (95% confidence interval 0.758-0.808), 71.6% (69.3-74.2%), 72.3% (69.2-76.4%), and 70.9% (68.9-74.3%), respectively.
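The two-level stacking architecture described above can be sketched in pure Python. The "models" and weights below are stand-ins invented for illustration (real work would use fitted scikit-learn/Keras estimators, and the study's actual parameters are not reproduced here); the structure, base classifiers feeding a meta-classifier, is the point.

```python
# Minimal sketch of a stacking ensemble: level-0 base classifiers emit
# risk scores; a level-1 meta-classifier combines them. The toy rules
# and weights below are hypothetical, not the study's fitted models.

def knn_like(x):      return 1.0 if x["age"] > 70 else 0.0
def svm_like(x):      return 1.0 if x["severity"] > 5 else 0.0
def logistic_like(x): return min(1.0, 0.05 * x["severity"] + 0.005 * x["age"])

BASE_MODELS = [knn_like, svm_like, logistic_like]

def meta_classifier(scores, weights=(0.3, 0.4, 0.3), threshold=0.5):
    """Stand-in for the ANN meta-learner: weighted vote over base scores."""
    combined = sum(w * s for w, s in zip(weights, scores))
    return combined, combined >= threshold

def predict(patient):
    scores = [m(patient) for m in BASE_MODELS]  # level-0 predictions
    return meta_classifier(scores)              # level-1 combination

risk, high_risk = predict({"age": 80, "severity": 7})
```

In a real stacking setup the meta-learner is trained on out-of-fold base-model predictions to avoid leaking training labels into the second level.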
ISBN: (print) 9783031222979; 9783031222986
We present Bronco: an in-development authoring language for Turing-complete procedural text generation. Our language emerged from a close examination of existing tools. This analysis led to our desire to support users in specifying yielding grammars, a formalism we invented that is more expressive than what several popular and available solutions offer. With this formalism as our basis, we detail the qualities of Bronco that expose its power in author-focused ways.
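For readers unfamiliar with the space, a plain context-free replacement grammar, the kind of baseline formalism that Bronco's yielding grammars aim to go beyond, looks like the sketch below. This illustrates only the generic baseline; it does not capture Bronco's semantics, and the grammar syntax here is invented for the example.

```python
# A generic context-free text grammar expander of the Tracery-like kind:
# #symbol# references are replaced by rule options. Shown only as the
# baseline formalism; this is not Bronco's yielding-grammar semantics.
GRAMMAR = {
    "story": ["The #animal# #verb#."],
    "animal": ["fox", "owl"],
    "verb": ["sleeps", "sings"],
}

def expand(text, grammar):
    """Yield every full expansion of #symbol# references, left to right."""
    start = text.find("#")
    if start == -1:
        yield text
        return
    end = text.find("#", start + 1)
    symbol = text[start + 1:end]
    for option in grammar[symbol]:
        yield from expand(text[:start] + option + text[end + 1:], grammar)

stories = [s for rule in GRAMMAR["story"] for s in expand(rule, GRAMMAR)]
print(stories)  # four stories: every animal/verb combination
```

Such grammars have no state or control flow, which is precisely the expressiveness ceiling that Turing-complete authoring languages like Bronco are designed to lift.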
A health intelligence sensing program for confined spaces is developed; its main functions are to coordinate the startup and operation of each hardware module and to acquire and transmit the monitoring data of operators. By using f...
ISBN: (digital) 9781665413169
ISBN: (print) 9781665413169
Data privacy is a key concern for smart contracts handling sensitive data. The existing work zkay addresses this concern by allowing developers without cryptographic expertise to enforce data privacy. However, while zkay avoids fundamental limitations of other private smart contract systems, it cannot express key applications that involve operations on foreign data. We present ZeeStar, a language and compiler that allows non-experts to instantiate private smart contracts supporting operations on foreign data. The ZeeStar language allows developers to ergonomically specify privacy constraints using zkay's privacy annotations. The ZeeStar compiler then provably realizes these constraints by combining non-interactive zero-knowledge proofs with additively homomorphic encryption. We implemented ZeeStar for the public Ethereum blockchain and demonstrated its expressiveness by encoding 12 example contracts, including oblivious transfer and a private payment system like Zether. ZeeStar is practical: it prepares transactions for our contracts in at most 54.7 s, at an average cost of 339k gas.
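Additively homomorphic encryption, one of ZeeStar's two cryptographic ingredients, lets anyone add two encrypted values without decrypting them. The textbook Paillier sketch below (tiny, insecure parameters purely for demonstration; not ZeeStar's implementation) shows the defining property Enc(a) · Enc(b) mod n² = Enc(a + b):

```python
# Textbook Paillier encryption with toy parameters, to demonstrate the
# additive homomorphism that ZeeStar-style systems rely on. Real keys
# are ~2048-bit; these primes are for illustration only.
import math, random

p, q = 1117, 1231
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)      # Carmichael function of n = p*q
g = n + 1                         # standard generator choice

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)   # precomputed decryption factor

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

a, b = 123, 456
product = (encrypt(a) * encrypt(b)) % n2   # multiply ciphertexts...
print(decrypt(product))                     # ...to add plaintexts: 579
```

A contract compiled by a system like ZeeStar can therefore update an encrypted balance held by someone else ("foreign data") on-chain, while zero-knowledge proofs attest that the update was well-formed.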
Background: Learning to code is increasingly embedded in secondary and higher education curricula, where solving programming exercises plays an important role in the learning process and in formative and summative assessment. Unfortunately, students admit that copying code from each other is common practice, and teachers indicate they rarely use plagiarism detection tools. Objectives: We want to lower the barrier for teachers to detect plagiarism by introducing a new source code plagiarism detection tool (Dolos) that is powered by state-of-the-art similarity detection algorithms, offers interactive visualizations, and uses generic parser models to support a broad range of programming languages. Methods: Dolos is compared with state-of-the-art plagiarism detection tools in a benchmark based on a standardized dataset. We describe our experience with integrating Dolos in a programming course with a strong focus on online learning and the impact of transitioning to remote assessment during the COVID-19 pandemic. Results and Conclusions: Dolos outperforms other plagiarism detection tools in detecting potential cases of plagiarism and is a valuable tool for preventing and detecting plagiarism in online learning environments. It is available under the permissive MIT open-source license. Implications: Dolos lowers the barrier for teachers to discover, prove, and prevent plagiarism in programming courses. This helps enable a shift towards open and online learning and assessment environments, and opens up interesting avenues for more effective learning and better assessment.
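Code-similarity tools in this family commonly compare fingerprints of token k-grams so that identifier renaming does not hide copying. The sketch below shows that generic idea (hedged: it is not Dolos's exact algorithm, and Dolos additionally uses language-aware parser models rather than whitespace tokenization):

```python
# Generic k-gram fingerprinting similarity, the family of technique used
# by source-code plagiarism detectors. Illustrative sketch only; not
# Dolos's exact algorithm or its parser-based tokenization.
def fingerprints(code, k=5):
    """Hash every k-gram of the token stream into a fingerprint set."""
    tokens = code.split()
    return {hash(tuple(tokens[i:i + k])) for i in range(len(tokens) - k + 1)}

def similarity(a, b, k=5):
    """Jaccard overlap of the two fingerprint sets, in [0, 1]."""
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)

original  = "def total ( xs ) : s = 0 ; for x in xs : s += x ; return s"
copied    = "def total ( ys ) : s = 0 ; for x in ys : s += x ; return s"
unrelated = "print ( 'hello world' ) ; print ( 'goodbye' )"

print(similarity(original, copied) > similarity(original, unrelated))
```

Because the copied version differs only in one renamed variable, most of its k-grams survive intact and its similarity to the original stays high, while the unrelated snippet shares no fingerprints at all.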